280. Properties of Context-Free Languages¶
280.1. Properties of Context-Free Languages¶
Note
We will not cover Theorem 8.2
Which of the following languages are CFL?
L={anbncj∣0<n≤j}
Note
NOT CFL
L={anbjanbj∣n>0,j>0}
Note
NOT CFL
L={anbjakbp∣n+j≤k+p,n>0,j>0,k>0,p>0}
Note
CFL
L={anbjajbn∣n>0,j>0}
Note
CFL
Pumping Lemma for Regular Languages
Let L be a regular language, then there is a constant m such that w∈L, |w|≥m, w=xyz such that
|xy|≤m
|y|≥1
for all i≥0,xyiz∈L
We were able to show that anbn is not a regular language.
Pumping Lemma for CFL’s
Let L be any infinite CFL. Then there is a constant m depending only on L, such that for every string w in L, with |w|≥m, we may partition w=uvxyz such that:
|vxy|≤m, (limit on size of substring)|vy|≥1, (v and y not both empty)For all i≥0,uvixyiz∈L
Proof: (sketch)
There is a CFG G such that L=L(G).
Consider the parse tree of a long string in L.
For any long string, some nonterminal N must appear twice in the path.
QED
Example 1
Consider L={anbncn:n≥1}. Show L is not a CFL.
Proof: (by contradiction)
Assume L is a CFL and apply the pumping lemma.Let m be the constant in the pumping lemma and consider w=ambmcm. Note |w|≥m.Show there is no division of w into uvxyz such that |vy|≥1, |vxy|≤m, and uvixyiz∈L for i=0,1,2,….Case 1: Neither v nor y can contain 2 or more distinct symbols. If v contains a ‘s and b ‘s, then uv2xy2z∉L since there will be b ‘s before a ‘s.Thus, v and y can be only a ‘s, b ‘s, or c ‘s (not mixed).Case 2: v=at1, then y=at2 or bt3(|vxy|≤m)If y=at2, then uv2xy2z=am+t1+t2bmcm∉L since t1+t2>0,n(a)>n(b) (number of a ‘s is greater than number of b ‘s)If y=bt3, then uv2xy2z=am+t1bm+t3cm∉L since t1+t3>0, either n(a)>n(c) or n(b)>n(c).Case 3: v=bt1, then y=bt2 or ct3.If y=bt2, then uv2xy2z=ambm+t1+t2cm∉L since t1+t2>0,n(b)>n(a).If y=ct3, then uv2xy2z=ambm+t1cm+t3∉L since t1+t3>0, either n(b)>n(a) or n(c)>n(a).Case 4: v=ct1, then y=ct2.Then, uv2xy2z=ambmcm+t1+t2∉L since t1+t2>0,n(c)>n(a).Thus, there is no breakdown of w into uvxyz such that |vy|≥1, |vxy|≤m and for all i≥0, uvixyiz is in L.This is a contradiction, thus, L is not a CFL. Q.E.D.
Example 2
Why would we want to recognize a language of the type {anbncn:n≥1}?
Recognize underlined words:
word_ is stored as wordββββ _ _ _ _ where β represents a backspace.
Example 3
Consider L={anbncp:p>n>0}. Show that L is not a CFL.
Proof:
Assume L is a CFL and apply the pumping lemma. Let m be the constant in the pumping lemma and consider w=ambmcm+1. Note: |w|≥m.Show that there is no division of w into uvxyz such that |vy|≥1, |vxy|≤m, and uvixyiz∈L for i=0,1,2,….Case 1: Neither v nor y can contain 2 or more distinct symbols. If v contains a’s and b’s, then uv2xy2z∉L since there will be b’s before a’s.Thus, v and y can be only a’s, b’s, or c’s (not mixed).Case 2: v=at1, then y=at2 or bt3(|vxy|≤m)If y=at2, then uv2xy2z=am+t1+t2bmcm+1∉L since t1+t2≥1,n(a)>n(b).If y=bt3, then uv2xy2z=am+t1bm+t3cm+1∉L since t1+t3≥1, either n(c) is not >n(a) or n(c) is not >n(b).Case 3: v=bt1, then y=bt2 or ct3.If y=bt2, then uv2xy2z=ambm+t1+t2cm+1∉L since t1+t2≥1,n(b)>n(a).If y=ct3, then uv0xy0z=ambm−t1cm+1−t3∉L since t1+t3≥1, either n(b)<n(a) or n(c) is not >n(a).Case 4: v=ct1, then y=ct2.Then, uv0xy0z=ambmcm+1−t1−t2∉L since t1+t2≥1,n(c) is not >n(a).Thus, there is no breakdown of w into uvxyz such that |vy|≥1,|vxy|≤m and for all i≥0,uvixyiz is in L.Contradiction, thus, L is not a CFL. Q.E.D.
Example 4
Consider L={ajbk:k=j2}. Show L is not a CFL.
Proof:
Assume L is a CFL and apply the pumping lemma. Let m be the constant in the pumping lemma and consider w=ambm2.Show there is no division of w into uvxyz such that |vy|≥1,|vxy|≤m, and uvixyiz∈L for i=0,1,2,….Case 1: Neither v nor y can contain 2 or more distinct symbols. If v contains a’s and b’s, then uv2xy2z∉L since there will be b’s before a’s.Thus, v and y can be only a’s, and then b’s (not mixed).Case 2: v=at1, then y=at2 or bt3.If y=at2, then uv2xy2z=am+t1+t2bm2∉L since 0<t1+t2≤m, not enough b’s.If y=bt3, then uv2xy2z=am+t1bm2+t3∉L since 0<t1+t3≤m, if t1=0, too many b’s. If t1=1, (m+1)2=m2+2m+1, so for t1≥1, there will be too few b’s.Case 3: v=bt1, then y=bt2.Then, uv2xy2z=ambm2+t1+t2∉L since t1+t2>0, not enough a’s.Thus, there is no breakdown of w into uvxyz such that |vy|≥1, |vxy|≤m and for all i≥0,uvixyiz is in L.Contradiction, thus, L is not a CFL. Q.E.D.
Exercise
Prove the following is not a CFL by applying the pumping lemma. (Answer is at the end of this module).
L={a2nb2mcndm:n,m≥0}.
Example 5
Consider L={wˉww:w∈Σ∗},Σ={a,b}, where ˉw is the string w with each occurence of a replaced by b and each occurence of b replaced by a. For example, w=baaa,ˉw=abbb,wˉw=baaaabbb. Show L is not a CFL.
- Proof:
- Assume L is a CFL and apply the pumping lemma. Let m be the constant in the pumping lemma and consider w=ambmam.Show there is no division of w into uvxyz such that |vy|≥1, |vxy|≤m, and uvixyiz∈L for i=0,1,2,….(Note: v and y could be mixed a’s and b’s and still be in the language).(Note: I will use a to represent the a’s in the front of w and A to represent the a’s at the end of w when it is not clear.)Case 1: v=at1, then y=at2,bt3 or at2bt3.If y=at2, then uv2xy2z=am+t1+t2bmam∉L since t1+t2>0, more a’s on left than on right.If y=bt3, then uv2xy2z=am+t1bm+t3am∉L since t1+t3>0, either more a’s on left than on right, or more b’s than a’s on the right.If y=at2bt3, then uv0xy0z=am−t1−t2bm−t3am∉L since t1+t2+t3>0, either number of a’s on left and right not equal, or number of b’s not equal to number of a’s on right.Case 2: v=at1bt2, then y=bt3. Then uv0xy0z=am−t1bm−t2−t3am∉L since t1+t2+t3>0, either number of a’s on left and right not equal, or number of b’s not equal to number of a’s on right.Case 3: v=bt1, then y=bt2,At3, or bt2At3.If y=bt2, then uv2xy2z=ambm+t1+t2am∉L since t1+t2>0, n(b)≠n(a) on either side.If y=At3, then uv2xy2z=ambm+t1am+t3∉L since t1+t3>0, either n(a) on left and right not equal, or n(b) not equal to number of a’s on left.If y=bt2At3, then uv0xy0z=ambm−t1−t2am−t3∉L since t1+t2+t3>0, either n(a) on left and right not equal, or n(b) not equal to number of a’s on left.Case 4: v=bt1At2, then y=At3.Then uv0xy0z=ambm−t1am−t2−t3∉L since t1+t2+t3>0, either n(a) on left and right not equal, or n(b) not equal to number of a’s on left.Case 5: v=At1, then y=At2.Then uv0xy0z=ambmam−t1−t2∉L since t1+t2>0, n(a) on left not equal to n(a) on right.Thus, there is no breakdown of w into uvxyz such that |vy|≥1, |vxy|≤m and for all i≥0, uvixyiz is in L.Contradiction, thus, L is not a CFL. Q.E.D.
Example 6
Consider L={anbpbpan}. L is a CFL. The pumping lemma should apply!
Let m≥4 be the constant in the pumping lemma. Consider w=ambmbmam.
We can break w into uvxyz, with:
u=ambm−2v=bx=bby=bz=bm−2am
Thus, |vy|≥1,|vxy|≤m, and for all i≥0,uvixyiz=ambm+ibm+iam∈L.
If you apply the pumping lemma to a CFL, then you should find a partition of w that works!
Note
This is not a proof that a language is CFL! Why not?
280.1.1. (Chap 8.2) Closure Properties of CFLs¶
Example: L={anbn|n>0}, L∘L={anbnambm|n>0,m>0}
Theorem 1
CFL’s are closed under union, concatenation, and star-closure.
Proof:
Given 2 CFG G1=(V1,T1,S1,P1) and G2=(V2,T2,S2,P2)
* Union:Construct G3 such that L(G3)=L(G1)∪L(G2).G3=(V3,T3,S3,P3)V3=V1∪V2∪{S3},T3=T1∪T2, and P3=P1∪P2∪{S3→S1|S2}.* Concatenation:Construct G3 such that L(G3)=L(G1)∘L(G2).G3=(V3,T3,S3,P3)V3=V1∪V2∪{S3},T3=T1∪T2, and P3=P1∪P2∪{S3→S1S2}.* Star-ClosureConstruct G3 such that L(G3)=L(G1)∗G3=(V3,T3,S3,P3)V3=V1∪{S3},T3=T1, and P3=P1∪P2∪{S3→S1S3|λ}.QED.
Theorem 2
CFL’s are NOT closed under intersection and complementation.
Proof:
* Intersection:Let L1={anbncm|n,m>0} and L2={anbmcm|n,m>0}L1 and L2 are CFLsThen L1∩L2={anbncn|n>0} is not CFL.* Complementation:Set identity:L1∩L2=¯¯L1∪¯L2Note
Show Venn Diagram!
Thus, CFLs are not closed under complementation.
Note
Example for theorem below:
Theorem 3
CFL’s are closed under regular intersection. If L1 is CFL and L2 is regular, then L1∩L2 is CFL.
Proof: (sketch)
This proof is similar to the construction proof in which we showed regular languages are closed under intersection. We take a NPDA for L1 and a DFA for L2 and construct a NPDA for L1∩L2.M1=(Q1,Σ,Γ,δ1,q0,z,F1) is an NPDA such that L(M1)=L1.M2=(Q2,Σ,δ2,q′0,F2) is a DFA such that L(M2)=L2.Construct NPDA M3=(Q3,Σ,Γ,δ3,(q0,q′0),z,F3) where Q3=Q1×Q2, and F3={(q,p)|q∈F1,p∈F2}.Example of replacing arcs (NOT a Proof!):Note this is not a proof, but sketches how we will combine the DFA and NPDA. We must formally define δ3. If
(qk,x)∈δ1(qi,a,b)δ2(q′j,a)=q′lthen
((qk,q′l),x)∈δ3((qi,q′j),a,b)Must show
((q0,q′0),w,z)∗⊢((qi,q′j),λ,x)(qi,q′j)∈F3if and only if
(q0,w,z)∗⊢(qi,λ,x)(q′0,w)∗⊢(q′j,λ)qi∈F1 and q′j∈F2Must show:
w in L(M_3)` iff w∈L(M1) and w∈L(M2).QED.
NOTE: Why doesn’t this proof work for if we try to construct an NPDA that represents the intersection of two NPDA’s? Need 2 stacks.
Questions about CFL:
1. Decide if CFL is empty?Know how to get rid of useless variables and productions, if there is anything left, then CFL is not empty.2. Decide if CFL is infinite?Get rid of useless variables and productions, λ-rules, and unit productions. Then if there is a variable that repeats A∗⇒xAy, then L is infinite.Note
What type of language is a grammar that has this property? How do we recognize it automatically?
Example 7
Consider L={a2nb2mcndm:n,m≥0}. Show L is not a CFL.
- Proof:
- Assume L is a CFL and apply the pumping lemma. Let m be the constant in the pumping lemma and consider w=a2mb2mcmdm.Show there is no division of w into uvxyz such that |vy|≥1, |vxy|≤m, and uvixyiz∈L for i=0,1,2,….Case 1: Neither v nor y can contain 2 or more distinct symbols. If v contains a’s and b’s, then uv2xy2z∉L since there will be b’s before a’s.Thus, v and y can be only a’s, b’s, c’s, or d’s (not mixed).Case 2: v=at1, then y=at2 or bt3(|vxy|≤m$).If y=at2, then uv2xy2z=a2m+t1+t2b2mcmdm∉L since t1+t2>0, the number of a’s is not twice the number of c’s.If y=bt3, then uv2xy2z=a2m+t1b2m+t3cmdm∉L since t1+t3>0, either the number of a’s (n(a)) is not twice n(c) or n(b) is not twice n(d).Case 3: v=bt1, then y=bt2 or ct3.If y=bt2, then uv2xy2z=a2mb2m+t1+t2cmdm∉L since t1+t2>0,n(b)>2∗n(d).If y=ct3, then uv2xy2z=a2mb2m+t1cm+t3dm∉L since t1+t3>0, either n(b)>2∗n(d) or 2∗n(c)>n(a).Case 4: v=ct1, then y=ct2 or dt3.If y=ct2, then uv2xy2z=a2mb2mcm+t1+t2dm∉L since t1+t2>0,2∗n(c)>n(a).If y=dt3, then uv2xy2z=a2mb2mcm+t1dm+t3∉L since t1+t3>0, either :math:2*n(c) > n(a)` or 2∗n(d)>n(b).Case 5: v=dt1, then y=dt2.Then uv2xy2z=a2mb2mcmdm+t1+t2∉L since t1+t2>0,2∗n(d)>n(c).Thus, there is no breakdown of w into uvxyz such that |vy|≥1, |vxy|≤m and for all i ge0, uvixyiz is in L.Contradiction, thus, L is not a CFL. Q.E.D.
The following “adversary game” is similar to the one used to help prove that a grammar is not regular.