Monash University
Faculty of Information Technology
2nd Semester 2020
FIT2014
Tutorial 4
Context-Free Grammars and the Pumping Lemma
Attempt all questions before the tutorial.
ASSESSED PREPARATION: Question 8.
You must submit a serious attempt at this entire question.
1. Find Regular Grammars for the following Regular Expressions:
i) (a ∪ b)∗(aa ∪ bb)
ii) ((a ∪ b)(a ∪ b))∗
iii) (aa ∪ bb)∗
iv) (ab)∗aa(ba)∗
v) (ab ∪ ba)(aa ∪ bb)∗(ab ∪ ba)
2.
Describe Pushdown Automata for each of the Regular Grammars you found in the previous
question.
3.
(a) Find a Context-Free Grammar for the language
{anbic2n : i, n ∈ N}.
(b) Prove, by induction on the string length, that every string of the form anbic2n can be
(c) Find a Pushdown Automaton that recognises this language.
4.
Consider the following four statements, which we call W , X, Y and Z.
W : Every context-free language is infinite.
X: Every context-free language is finite.
Y : There exists an infinite context-free language.
Z: There exists a finite context-free language.
1
(a)
(i). Which of X, Y , Z is the logical negation of W?
(ii). Which of W , X, Y , Z is true?
(b) Define the predicates CFL and Infinite to have the following meanings.
CFL(L) L is context-free
Infinite(L) L is infinite
The universe of discourse is all languages.
(i). Write statements W , X, Y , Z in predicate logic, using these two predicates.
(ii). Use the properties of quantifiers, and laws of propositional logic as needed, to derive an exis-
tential statement equivalent to ¬W .
(iii). Which of X,Y, Z is equivalent to your answer to (ii)?
5. Let LOL be the language generated by the following Context-Free Grammar.
S → P (1)
P → PP (2)
P → haQ (3)
Q → Qa (4)
Q → ε (5)
(a) Warm-up:
(i). What is the shortest string in LOL? Give a derivation for it.
(ii). Is the above grammar regular?
(iii). Is LOL a regular language? If so, give a regular expression for it. If not, prove it.
(iv). Is the above grammar in Chomsky Normal Form? If so, why; if not, state why not.
(b) Here is an attempted proof that LOL is infinite. Comment on what is correct and incorrect in
this proof.
1. Let x be some string in LOL.
2. The string x must have a derivation, using the above grammar.
3. The only rule in the grammar that does not have a nonterminal symbol on its
right-hand side is the last one, (5).
4. So any derivation of x must finish with an application of (5).
5. Therefore the string just before x, in the derivation of x, must have an Q.
2
6. Therefore the derivation of x has the form
S =⇒ · · · =⇒ xleftQxright (5)=⇒ xleftxright = x,
where xleft denotes the portion of x before Q, and xright denotes the portion of x
after Q.
7. So, instead of applying rule (5) to this last occurrence of Q, we could apply rule (4)
followed by rule (5), giving
S =⇒ · · · =⇒ xleftQxright (4)=⇒ xleftQaxright (5)=⇒ xleftaxright.
This new string is also in LOL and is one letter longer than x. Call it y.
8. We can apply this argument again to y, to get a string in LOL that is one letter
longer than y, and then we can get strings that are even longer than that, and so
on, indefinitely.
9. So we can generate infinitely many strings. Therefore LOL is infinite.
(c) Prove by induction that, for all n ≥ 2, the language LOL contains a string of length at least n.
(d) Here is an attempted proof by contradiction that LOL is infinite. It uses, in the middle, the
1. Assume, by way of contradiction, that LOL is finite.
2. 〈 Insert here a correct proof, based on (c), that LOL has arbitrarily long strings. 〉
3. This contradicts our assumption that LOL is finite.
4. Therefore LOL is infinite.
(e) Construct a correct proof by contradiction that LOL is infinite, using the following approach:
1. Start by assuming LOL is finite, as in (c) step 1.
2. Show that LOL contains a nonempty string. (No need to use the assumed finiteness
of LOL just yet.)
3. Among all the nonempty strings in LOL, choose x carefully . . . How should you
choose it?
4. Show how to construct a string in LOL that is longer than x.
5. . . . . . .
6. Therefore LOL is infinite.
6. This question is a sequel to Tute 3, Q6.
(a) Review Tute 3, Q6, reflecting on the discussion of it in your tutorial, and correcting your attempt
at the question.
3
(b) Use the Pumping Lemma for CFLs to prove that the language {an2 : n ∈ N} is not context-free.
(c) Hence prove that the language of binary string representations of adjacency matrices of graphs
is not context-free.
You may assume that the intersection of a context-free language with a regular language is
context-free. (Challenge: prove this.) Note that, unlike regular languages, the class of context-free
languages is not closed under intersection. (Can you prove this?)
7. Recall the Pumping Games of Tute 3. An expansion pack has now been released, called
Double Pumping Games.
You can play a Double Pumping Game for any language. The players are Con and Noni. First,
the players are given a language L (which may or may not be context-free). Con moves first, then
Noni moves, then Con moves, then Noni moves. The rules for their moves are as follows.
• Con first chooses a number k ∈ N.
• Then Noni chooses a string w ∈ L with |w| > 2k−1.
• Then Con divides w up into substrings u, v, x, y, z such that vy 6= ε and |vxy| ≤ 2k.
• Then Noni chooses a non-negative integer i.
The result of the game is that:
• if uvixyiz ∈ L, then Con wins;
• if uvixyiz 6∈ L, then Noni wins.
(a) Play the game for (i) PALINDROMES, and (ii) {anbnan : n ≥ 0}.
(b) Using quantifiers, write down the assertion that Noni has a winning strategy.
(c) Using quantifiers, write down the assertion that Con has a winning strategy.
(d) What does the Pumping Lemma for CFLs tell us about circumstances in which winning
strategies exist?
8. In the game of Cricket, a bowler “bowls” a cricket ball at a batsman. Each time this is
done, the batsman tries to hit the ball, and might score some number of runs, or the bowler might
take the wicket of the batsman (in which case the batsman has to leave, and is replaced by another
batsman). A bowler bowls the ball six times, making up an over, and then makes way for another
bowler to have their turn.
In cricket statistics, a bowler’s activity is summarised by four nonnegative integers: O, M , R,
W , where
• O is the total number of overs bowled by that bowler,
• M is the number of maiden overs by that bowler: these are those overs in which no run was
scored by batsmen,
• R is the number of runs scored from the bowling of that bowler,
• W is the number of wickets taken by that bowler.
We assume:1
1Cricket experts might notice some simplifying assumptions here . . .
4
• A batsman can score up to six runs from each ball bowled, making up to 36 runs from an over.
• At most one wicket can be taken with each ball, making up to 6 wickets per over.
We also use the fact that the number of maiden overs is clearly at most the total number of overs.
Let CricketBowlingFigures be the language of quadruples (O,M,R,W ) of nonnegative integers,
each in unary2, where O ≥ M , R ≤ 36 (O −M), and W ≤ 6O. We assume that any of these
numbers can be arbitrarily large, subject to these constraints. (The statistics can be compiled over
entire lifetimes, not just single matches, and some bowlers may be immortal.) For example,
• (
27︷ ︸︸ ︷
111 · · · 11, 1111,
72︷ ︸︸ ︷
111 · · · 11, 11) ∈ CricketBowlingFigures. It represents the bowling figures
(27,4,72,2). 3
• (11, 1,
36︷ ︸︸ ︷
111 · · · 11, 111111111111) ∈ CricketBowlingFigures. It represents the bowling figures
(2,1,36,12).
• (, , , ) ∈ CricketBowlingFigures. It represents the bowling figures (0,0,0,0) of a bowler who does
not get a turn at bowling.
• (1, 11, 111, 1111) 6∈ CricketBowlingFigures. The bowling figures (1,2,3,4) are impossible, since
O < M .
• (11, 1,
36︷ ︸︸ ︷
111 · · · 11, 1111111111111) 6∈ CricketBowlingFigures. The bowling figures (2,1,36,13)
are impossible, since W > 6O.
Is the language CricketBowlingFigures regular? Prove or disprove.
9. Let G be the following CFG.
S → AB (6)
A → AC (7)
B → CD (8)
A → a (9)
C → b (10)
D → a (11)
Use the Cocke-Younger-Kasami (CYK) algorithm to determine whether or not the string abbba is
generated by G.
10.
(a) Convert the CFG in Q5 into a Chomsky Normal Form grammar for LOL.
(b) Using this Chomsky Normal Form grammar, apply the CYK algorithm to the string hahaa.
11. Prove by induction that the CYK algorithm correctly determines whether or not a given
string x is generated by a given context-free grammar G.
More specifically:
2In unary representation, a number n is represented as a string of n 1s. For example, in unary, the number 5 is
represented by 11111, and 0 is represented by the empty string.
3Question for cricket tragics: what is the significance of these bowling figures, in the history of Test Cricket?
5
1. Prove by induction on n that, for every substring y of x, where |y| = n, the algorithm correctly
finds all nonterminals in G that can generate y.
2. Explain briefly how, if we know all the nonterminals that can generate x, we can then determine
whether x can be generated by G.
The idea of the proof is given in Lecture 13, slide 8. Your task is to avoid the gap in that sketch
proof (“Continue, in this way, . . . ”) and produce a correct proof by induction.
Supplementary exercises
12. You have seen how to construct a grammar for any regular language, using an NFA for the
language. But suppose we wanted to construct a grammar for a regular language directly from a
regular expression, and without constructing an NFA first. How would you do it? (The grammar
need not be regular.)
13.
Let k be a fixed positive integer. A k-limited Pushdown Automaton is a Pushdown Automaton
whose stack can store at most k symbols (regardless of the length of the input string).
(a) Explain how a k-limited Pushdown Automaton can be simulated by a Nondeterministic Finite
Automaton.
(b) Deduce that a language is recognised by a k-limited Pushdown Automaton if and only if it
is regular.
14. In Australian Rules Football, a team gains points by kicking goals and behinds. A goal is
worth 6 points, and a behind is worth one point. The winner is the team with the greatest number
of points at the end of the game.
A team’s score is given as a triple of numbers (g, b, p), where g is the number of goals, b is the
number of behinds, and p is the number of points. These numbers must be nonnegative and satisfy
6g + b = p. For example, (1, 2, 8) is a valid score, since 6× 1 + 2 = 8.
Let FootyScore be the language of valid football scores (g, b, p), where each number is represented
in unary, and the alphabet consists of 1, the two parentheses, and the comma. For example,
(1,11,11111111) ∈ FootyScore.
In this language, there is no upper limit on the scores.
(a) Is the language FootyScore regular? Prove or disprove.
(b) Is the language FootyScore context-free? Prove or disprove.
6

Email:51zuoyejun

@gmail.com