Monash University

Faculty of Information Technology

2nd Semester 2020

FIT2014

Tutorial 4

Context-Free Grammars and the Pumping Lemma

Attempt all questions before the tutorial.

ASSESSED PREPARATION: Question 8.

You must submit a serious attempt at this entire question.

1. Find Regular Grammars for the following Regular Expressions:

i) (a ∪ b)∗(aa ∪ bb)

ii) ((a ∪ b)(a ∪ b))∗

iii) (aa ∪ bb)∗

iv) (ab)∗aa(ba)∗

v) (ab ∪ ba)(aa ∪ bb)∗(ab ∪ ba)

2.

Describe Pushdown Automata for each of the Regular Grammars you found in the previous

question.

3.

(a) Find a Context-Free Grammar for the language

{anbic2n : i, n ∈ N}.

(b) Prove, by induction on the string length, that every string of the form anbic2n can be

generated by your grammar.

(c) Find a Pushdown Automaton that recognises this language.

4.

Consider the following four statements, which we call W , X, Y and Z.

W : Every context-free language is infinite.

X: Every context-free language is finite.

Y : There exists an infinite context-free language.

Z: There exists a finite context-free language.

1

(a)

(i). Which of X, Y , Z is the logical negation of W?

(ii). Which of W , X, Y , Z is true?

(b) Define the predicates CFL and Infinite to have the following meanings.

CFL(L) L is context-free

Infinite(L) L is infinite

The universe of discourse is all languages.

(i). Write statements W , X, Y , Z in predicate logic, using these two predicates.

(ii). Use the properties of quantifiers, and laws of propositional logic as needed, to derive an exis-

tential statement equivalent to ¬W .

(iii). Which of X,Y, Z is equivalent to your answer to (ii)?

5. Let LOL be the language generated by the following Context-Free Grammar.

S → P (1)

P → PP (2)

P → haQ (3)

Q → Qa (4)

Q → ε (5)

(a) Warm-up:

(i). What is the shortest string in LOL? Give a derivation for it.

(ii). Is the above grammar regular?

(iii). Is LOL a regular language? If so, give a regular expression for it. If not, prove it.

(iv). Is the above grammar in Chomsky Normal Form? If so, why; if not, state why not.

(b) Here is an attempted proof that LOL is infinite. Comment on what is correct and incorrect in

this proof.

1. Let x be some string in LOL.

2. The string x must have a derivation, using the above grammar.

3. The only rule in the grammar that does not have a nonterminal symbol on its

right-hand side is the last one, (5).

4. So any derivation of x must finish with an application of (5).

5. Therefore the string just before x, in the derivation of x, must have an Q.

2

6. Therefore the derivation of x has the form

S =⇒ · · · =⇒ xleftQxright (5)=⇒ xleftxright = x,

where xleft denotes the portion of x before Q, and xright denotes the portion of x

after Q.

7. So, instead of applying rule (5) to this last occurrence of Q, we could apply rule (4)

followed by rule (5), giving

S =⇒ · · · =⇒ xleftQxright (4)=⇒ xleftQaxright (5)=⇒ xleftaxright.

This new string is also in LOL and is one letter longer than x. Call it y.

8. We can apply this argument again to y, to get a string in LOL that is one letter

longer than y, and then we can get strings that are even longer than that, and so

on, indefinitely.

9. So we can generate infinitely many strings. Therefore LOL is infinite.

(c) Prove by induction that, for all n ≥ 2, the language LOL contains a string of length at least n.

(d) Here is an attempted proof by contradiction that LOL is infinite. It uses, in the middle, the

proof given in (c). Comment on what is good and bad about this proof.

1. Assume, by way of contradiction, that LOL is finite.

2. 〈 Insert here a correct proof, based on (c), that LOL has arbitrarily long strings. 〉

3. This contradicts our assumption that LOL is finite.

4. Therefore LOL is infinite.

(e) Construct a correct proof by contradiction that LOL is infinite, using the following approach:

1. Start by assuming LOL is finite, as in (c) step 1.

2. Show that LOL contains a nonempty string. (No need to use the assumed finiteness

of LOL just yet.)

3. Among all the nonempty strings in LOL, choose x carefully . . . How should you

choose it?

4. Show how to construct a string in LOL that is longer than x.

5. . . . . . .

6. Therefore LOL is infinite.

6. This question is a sequel to Tute 3, Q6.

(a) Review Tute 3, Q6, reflecting on the discussion of it in your tutorial, and correcting your attempt

at the question.

3

(b) Use the Pumping Lemma for CFLs to prove that the language {an2 : n ∈ N} is not context-free.

(c) Hence prove that the language of binary string representations of adjacency matrices of graphs

is not context-free.

You may assume that the intersection of a context-free language with a regular language is

context-free. (Challenge: prove this.) Note that, unlike regular languages, the class of context-free

languages is not closed under intersection. (Can you prove this?)

7. Recall the Pumping Games of Tute 3. An expansion pack has now been released, called

Double Pumping Games.

You can play a Double Pumping Game for any language. The players are Con and Noni. First,

the players are given a language L (which may or may not be context-free). Con moves first, then

Noni moves, then Con moves, then Noni moves. The rules for their moves are as follows.

• Con first chooses a number k ∈ N.

• Then Noni chooses a string w ∈ L with |w| > 2k−1.

• Then Con divides w up into substrings u, v, x, y, z such that vy 6= ε and |vxy| ≤ 2k.

• Then Noni chooses a non-negative integer i.

The result of the game is that:

• if uvixyiz ∈ L, then Con wins;

• if uvixyiz 6∈ L, then Noni wins.

(a) Play the game for (i) PALINDROMES, and (ii) {anbnan : n ≥ 0}.

(b) Using quantifiers, write down the assertion that Noni has a winning strategy.

(c) Using quantifiers, write down the assertion that Con has a winning strategy.

(d) What does the Pumping Lemma for CFLs tell us about circumstances in which winning

strategies exist?

8. In the game of Cricket, a bowler “bowls” a cricket ball at a batsman. Each time this is

done, the batsman tries to hit the ball, and might score some number of runs, or the bowler might

take the wicket of the batsman (in which case the batsman has to leave, and is replaced by another

batsman). A bowler bowls the ball six times, making up an over, and then makes way for another

bowler to have their turn.

In cricket statistics, a bowler’s activity is summarised by four nonnegative integers: O, M , R,

W , where

• O is the total number of overs bowled by that bowler,

• M is the number of maiden overs by that bowler: these are those overs in which no run was

scored by batsmen,

• R is the number of runs scored from the bowling of that bowler,

• W is the number of wickets taken by that bowler.

We assume:1

1Cricket experts might notice some simplifying assumptions here . . .

4

• A batsman can score up to six runs from each ball bowled, making up to 36 runs from an over.

• At most one wicket can be taken with each ball, making up to 6 wickets per over.

We also use the fact that the number of maiden overs is clearly at most the total number of overs.

Let CricketBowlingFigures be the language of quadruples (O,M,R,W ) of nonnegative integers,

each in unary2, where O ≥ M , R ≤ 36 (O −M), and W ≤ 6O. We assume that any of these

numbers can be arbitrarily large, subject to these constraints. (The statistics can be compiled over

entire lifetimes, not just single matches, and some bowlers may be immortal.) For example,

• (

27︷ ︸︸ ︷

111 · · · 11, 1111,

72︷ ︸︸ ︷

111 · · · 11, 11) ∈ CricketBowlingFigures. It represents the bowling figures

(27,4,72,2). 3

• (11, 1,

36︷ ︸︸ ︷

111 · · · 11, 111111111111) ∈ CricketBowlingFigures. It represents the bowling figures

(2,1,36,12).

• (, , , ) ∈ CricketBowlingFigures. It represents the bowling figures (0,0,0,0) of a bowler who does

not get a turn at bowling.

• (1, 11, 111, 1111) 6∈ CricketBowlingFigures. The bowling figures (1,2,3,4) are impossible, since

O < M .

• (11, 1,

36︷ ︸︸ ︷

111 · · · 11, 1111111111111) 6∈ CricketBowlingFigures. The bowling figures (2,1,36,13)

are impossible, since W > 6O.

Is the language CricketBowlingFigures regular? Prove or disprove.

9. Let G be the following CFG.

S → AB (6)

A → AC (7)

B → CD (8)

A → a (9)

C → b (10)

D → a (11)

Use the Cocke-Younger-Kasami (CYK) algorithm to determine whether or not the string abbba is

generated by G.

10.

(a) Convert the CFG in Q5 into a Chomsky Normal Form grammar for LOL.

(b) Using this Chomsky Normal Form grammar, apply the CYK algorithm to the string hahaa.

11. Prove by induction that the CYK algorithm correctly determines whether or not a given

string x is generated by a given context-free grammar G.

More specifically:

2In unary representation, a number n is represented as a string of n 1s. For example, in unary, the number 5 is

represented by 11111, and 0 is represented by the empty string.

3Question for cricket tragics: what is the significance of these bowling figures, in the history of Test Cricket?

5

1. Prove by induction on n that, for every substring y of x, where |y| = n, the algorithm correctly

finds all nonterminals in G that can generate y.

2. Explain briefly how, if we know all the nonterminals that can generate x, we can then determine

whether x can be generated by G.

The idea of the proof is given in Lecture 13, slide 8. Your task is to avoid the gap in that sketch

proof (“Continue, in this way, . . . ”) and produce a correct proof by induction.

Supplementary exercises

12. You have seen how to construct a grammar for any regular language, using an NFA for the

language. But suppose we wanted to construct a grammar for a regular language directly from a

regular expression, and without constructing an NFA first. How would you do it? (The grammar

need not be regular.)

13.

Let k be a fixed positive integer. A k-limited Pushdown Automaton is a Pushdown Automaton

whose stack can store at most k symbols (regardless of the length of the input string).

(a) Explain how a k-limited Pushdown Automaton can be simulated by a Nondeterministic Finite

Automaton.

(b) Deduce that a language is recognised by a k-limited Pushdown Automaton if and only if it

is regular.

14. In Australian Rules Football, a team gains points by kicking goals and behinds. A goal is

worth 6 points, and a behind is worth one point. The winner is the team with the greatest number

of points at the end of the game.

A team’s score is given as a triple of numbers (g, b, p), where g is the number of goals, b is the

number of behinds, and p is the number of points. These numbers must be nonnegative and satisfy

6g + b = p. For example, (1, 2, 8) is a valid score, since 6× 1 + 2 = 8.

Let FootyScore be the language of valid football scores (g, b, p), where each number is represented

in unary, and the alphabet consists of 1, the two parentheses, and the comma. For example,

(1,11,11111111) ∈ FootyScore.

In this language, there is no upper limit on the scores.

(a) Is the language FootyScore regular? Prove or disprove.

(b) Is the language FootyScore context-free? Prove or disprove.

6

欢迎咨询51作业君