2020/11/1 CS 241 — Fall 2020 — Assignment 6 https://student.cs.uwaterloo.ca/~cs241/a6/ 1/4 CS 241 — Fall 2020 — Assignment 6 Assignments for CS 241 ← Assignment 5 Assignment 6 Assignment 7 → Friday, October 30th at 5:00 pm Monday, November 2nd at 5:00 pm Friday, November 13th at 5:00 pm Friday, November 20th at 5:00pm P1 • P2 • P3 • P4 • P5 In this assignment, you will write a general LR(1) parser, and then specialize it to obtain a parser for WLP4. You do not need to generate LR(1) DFAs, as these will be given to you along with the grammar as input to your program. Your task is just to implement the LR(1) parsing algorithm. Marmoset Notes For each question, Marmoset has public and release tests, and often has secret tests as well. When you release test a submission, Marmoset will only report on public and release tests. No information about secret tests will be visible until after the assignment deadline. Marmoset tests are not available yet. When they are released, a post will be made on Piazza. The mark breakdown for the assignment problems will be released at the same time. Problem 1 (filename: a6p1.cfg-r) Write a .cfg-r file which solves A5P6. You may find the cs241.cfgrl tool helpful for checking your solutions, which takes a .cfg-r file and produces a .cfg file with a forward leftmost derivation equivalent to the .cfg-r file's reversed rightmost derivation. Problem 2 (filename: a6p2.cfg-r) Write a .cfg-r file that solves A5P7. Problem 3 (filename: lr.cc or lr.rkt) Write a Racket or C++ program that reads an LR1 File representing a context-free grammar, an LR(1) DFA, and a sequence to be recognized. If the sequence is in the language, output a derivation for the sequence using the unindented reversed rightmost derivation format described on the CFG-R page. 2020/11/1 CS 241 — Fall 2020 — Assignment 6 https://student.cs.uwaterloo.ca/~cs241/a6/ 2/4 If the input is not recognized, output "ERROR at k" (followed by a single newline character) to standard error, where k is one greater than the number of tokens in the longest correct prefix of the input sequence. The longest correct prefix of an input sequence is the longest prefix which could possibly be a prefix of a string generated by the grammar. See the example below. Caution: In previous assignments, Marmoset would ignore debugging output on standard error and just look for the string "ERROR". However, for this assignment, standard error must have the exact format specified above with no extraneous output. For the sample LR1 file as input, the correct output is: term id expr term term id expr term term ( expr ) expr expr - term term id expr expr - term S BOF expr EOF If you replace the last line of this file by: BOF id - id ) - id EOF the correct output (to standard error) is: ERROR at 5 In this case, the longest correct prefix is BOF id - id since this prefix could possibly be extended into a string generated by the grammar, such as BOF id - id EOF. The next longest prefix is BOF id - id ) which cannot be extended into a string generated by the grammar, since the ) can never be matched. Since the longest corrext prefix has 4 tokens, the output is "ERROR at 5". Your program should only take one sequence as input, but this sequence could be split over multiple lines. For example, if the last line of the sample LR1 file was replaced with the following sequence of lines: BOF id - ( id ) - id EOF Then the correct output would be the same as the example shown above. The test inputs on Marmoset will use Unix-style newlines, with no extraneous whitespace at the end of a line, and the last line in the file shall be terminated with a newline. You may test your program with the WLP4 grammar and parse table, in .lr1 format after adding a sequence to be recognized. (Note: if your browser has a automatic translation feature, disable it, 2020/11/1 CS 241 — Fall 2020 — Assignment 6 https://student.cs.uwaterloo.ca/~cs241/a6/ 3/4 as it may corrupt this file. Or simply download the file instead of copy-pasting it.) You can also use the tool cs241.slr to generate the LR(1) DFA for any SLR(1) grammar. You can use it to generate LR(1) DFAs for your own grammars, allowing you to create additional test inputs to this problem. cs241.slr expects a CFG file file on standard input, and outputs an LR1 file on standard output. Problem 4 (filename: wlp4parse.cc or wlp4parse.rkt) Modify your solution to Problem 3 to always use the WLP4 grammar and parse table. It should accept as input the output from Assignment 5 Problem 1 and have the same output as Problem 3, except that each terminal is printed with its lexeme at the appropriate point in the derivation. For example, if the input to the program is (note the lack of BOF and EOF): INT int WAIN wain LPAREN ( INT int ID foo COMMA , INT int ID bar RPAREN ) LBRACE { RETURN return NUM 42 SEMI ; RBRACE } The output should be: BOF BOF INT int WAIN wain LPAREN ( INT int type INT ID foo dcl type ID COMMA , INT int type INT ID bar dcl type ID RPAREN ) LBRACE { dcls statements RETURN return NUM 42 factor NUM term factor expr term SEMI ; RBRACE } main INT WAIN LPAREN dcl COMMA dcl RPAREN LBRACE dcls statements RETURN expr SEMI RBRACE procedures main EOF EOF start BOF procedures EOF 2020/11/1 CS 241 — Fall 2020 — Assignment 6 https://student.cs.uwaterloo.ca/~cs241/a6/ 4/4 Errors should be handled in the same way as Problem 3. Since BOF is not present in the input, it is not counted as a token when determining the length of the longest correct prefix. Problem 5 (filename: wlp4parse.cc or wlp4parse.rkt) Modify your output to Problem 4 to follow the WLP4I format. The format of a .wlp4i file represents the derivation as well as the tokens in the program. You should make a parse tree and print it using a preorder traversal to achieve this. You can use wlp4scan to help generate inputs to your program, and wlp4parse and wlp4icheck to help check your output. For example, when using the input from the previous example, the corresponding output is: start BOF procedures EOF BOF BOF procedures main main INT WAIN LPAREN dcl COMMA dcl RPAREN LBRACE dcls statements RETURN expr SEMI RBRACE INT int WAIN wain LPAREN ( dcl type ID type INT INT int ID foo COMMA , dcl type ID type INT INT int ID bar RPAREN ) LBRACE { dcls statements RETURN return expr term term factor factor NUM NUM 42 SEMI ; RBRACE } EOF EOF Errors should be handled in the same way as Problem 4. Click here to return to the top of the page.
欢迎咨询51作业君