Cminus:A Language for Practice Implementation
1 Purpose
This document describes the Cminus programming language. It is intended to provide enough detail to allow
implementation of a parser (context-free analyzer). This document contains information that is irrelevant to
the some portions of each assignment in this course. Read the programming descriptions carefully to ensure
that you understand the scope of each assignment.
2 Introduction
Cminus is a programming language designed for practice implementation. Cminus is an extremely simplified
version of C in which one may perform simple integer calculations. Cminus is intended to be simple enough
to implement in a single semester by any student willing to put in some effort. Each feature included in the
language was added specifically to illustrate some problem that arises in the design and implementation of
a very simple compiler.
Cminus supports two basic data types:integerandfloat. These types may be aggregated into a one
dimensional array. The language is intended to bestrongly typed; that is, the type of each expression
should be determinable at compile time. Since there is nobooleandata type, integers are used as logical
values. Control structures in Cminus are limited. It has anifstatement, awhilestatement and acompound
statement. Procedure abstraction is also limited. One cannot declare a procedure inside another procedure.
3 Lexical Properties of Cminus
In this section, any sequence of characters denoted inside of a pair of"’s indicates the actual string literal
found inside of the"’s.
1. In Cminus, blanks are significant.
2. In Cminus, all keywords are reserved; that is, the programmer cannot use a Cminus keyword as the
name of a variable. The valid keywords are:
〈ELSE〉 →"else"
〈EXIT〉 →"exit"
〈FLOAT〉 →"float"
〈IF〉→"if"
〈INT〉 →"int"
〈READ〉 →"read"
〈RETURN〉 →"return"
〈WHILE〉 →"while"
〈WRITE〉 →"write"
(Note that Cminus iscase sensitive,that is, the variableXdiffers fromx. Thus,ifis a keyword, but
IFcan be a variable name.)
1
3. The following special characters have meanings in a Cminus program.
〈AND〉 →"&&"
〈ASSIGN〉 →"="
〈CM〉 →","
〈DIVIDE〉 →"/"
〈EQ〉 →"=="
〈GE〉 →">="
〈GT〉 →">"
〈LBR〉 →"{"
〈LBK〉 →"["
〈LE〉 →"<="
〈LP〉 →"("
〈LT〉 →"<"
〈MINUS〉 →"-"
〈NE〉 →"!="
〈NOT〉 →"!"
〈OR〉 →"||"
〈PLUS〉 →"+"
〈RBR〉 →"}"
〈RBK〉 →"]"
〈RP〉 →")"
〈SC〉 →";"
〈DQ〉 →"\""
〈TIMES〉 →"*"
4. Comments are delimited by the characters/*and*/. A/*begins a comment; it is valid in no other
context. A*/ends a comment; it cannot appear inside a comment. Comments may appear before or
after any other token.
5. Identifiers are written with upper and lowercase letters and are defined as follows:
〈LETTER〉 →"a"|"b"|"c"| ··· |"z"|"A"|"B"| ··· |"Z"
〈DIGIT〉→"0"|"1"|"2"| ··· |"9"
〈IDENTIFIER〉 → 〈LETTER〉(〈LETTER〉 |〈DIGIT〉)
∗
The implementor may restrict the length of identifiers so long as identifiers of at least 31 characters
are legal.
6. Constants are defined as follows:
〈POSITIVE〉 →"1"|"2"|"3"|...|"9"
〈INTCON〉 → 〈POSITIVE〉〈DIGIT〉
∗
|0
Special string constants are acceptable inwritestatements:
〈STRING〉 → 〈DQ〉(∼(〈DQ〉) )
∗
〈DQ〉
4 Cminus Syntax
This section gives a syntactic description of Cminus. The sections following the grammar provide implemen-
tation notes on the various parts of the grammar.
The grammar, as stated, defines the language. It may require some massaging before implementation
with any particular parser generator system.
2
4.1 BNF
The following grammar describes the context-free syntax of Cminus:
〈Program〉→ 〈DeclList〉〈Procedures〉
| 〈Procedures〉
〈Procedures〉→ 〈ProcedureDecl〉〈Procedures〉
| 〈ProcedureDecl〉
〈ProcedureDecl〉 → 〈ProcedureHead〉〈ProcedureBody〉
〈ProcedureHead〉 → 〈FunctionDecl〉〈DeclList〉
| 〈FunctionDecl〉
〈FunctionDecl〉→ 〈Type〉〈IDENTIFIER〉〈LP〉〈RP〉〈LBR〉
〈ProcedureBody〉 → 〈StatementList〉〈RBR〉
〈DeclList〉→ 〈Type〉〈IdentifierList〉〈SC〉
| 〈DeclList〉〈Type〉〈IdentifierList〉〈SC〉
〈IdentifierList〉→ 〈VarDecl〉
| 〈IdentifierList〉〈CM〉〈VarDecl〉
〈VarDecl〉→ 〈IDENTIFIER〉
| 〈IDENTIFIER〉〈LBK〉〈INTCON〉〈RBK〉
〈Type〉→ 〈INT〉
| 〈FLOAT〉
〈StatementList〉→ 〈Statement〉
| 〈StatementList〉〈Stmt〉
〈Statement〉→ 〈Assignment〉
| 〈IfStatement〉
| 〈WhileStatement〉
| 〈IOStatement〉
| 〈ReturnStatement〉
| 〈ExitStatement〉
| 〈CompoundStatement〉
〈Assignment〉→ 〈Variable〉〈ASSIGN〉〈Expr〉〈SC〉
〈IfStatement〉→ 〈IF〉〈TestAndThen〉〈ELSE〉〈CompoundStatement〉
| 〈IF〉〈TestAndThen〉
〈TestAndThen〉→ 〈Test〉〈CompoundStatement〉
〈Test〉→(〈Expr〉)
〈WhileStatement〉 → 〈WhileToken〉〈WhileExpr〉〈Statement〉
〈WhileToken〉→ 〈WHILE〉
〈WhileExpr〉→(〈Expr〉)
〈IOStatement〉→ 〈READ〉〈LP〉〈Variable〉〈RP〉〈SC〉
| 〈WRITE〉〈LP〉〈Expr〉〈RP〉〈SC〉
| 〈WRITE〉〈LP〉〈StringConstant〉〈RP〉〈SC〉
〈ReturnStatement〉 → 〈RETURN〉〈Expr〉〈SC〉
〈ExitStatement〉→ 〈EXIT〉〈SC〉
〈CompoundStatement〉 → 〈LBR〉〈StatementList〉〈RBR〉
3
〈Expr〉→ 〈Expr〉〈AND〉〈SimpleExpr〉
| 〈Expr〉〈OR〉 〈SimpleExpr〉
| 〈SimpleExpr〉
| 〈NOT〉〈SimpleExpr〉
〈SimpleExpr〉 → 〈SimpleExpr〉〈EQ〉 〈AddExpr〉
| 〈SimpleExpr〉〈NE〉〈AddExpr〉
| 〈SimpleExpr〉〈LE〉〈AddExpr〉
| 〈SimpleExpr〉〈LT〉〈AddExpr〉
| 〈SimpleExpr〉〈GE〉〈AddExpr〉
| 〈SimpleExpr〉〈GT〉〈AddExpr〉
| 〈AddExpr〉
〈AddExpr〉 → 〈AddExpr〉〈PLUS〉〈MulExpr〉
| 〈AddExpr〉〈MINUS〉〈MulExpr〉
| 〈MulExpr〉
〈MulExpr〉 → 〈MulExpr〉〈TIMES〉〈Factor〉
| 〈MulExpr〉〈DIVIDE〉〈Factor〉
| 〈Factor〉
〈Factor〉→ 〈Variable〉
| 〈Constant〉
| 〈IDENTIFIER〉〈LP〉〈RP〉
| 〈LP〉〈Expr〉〈RP〉
〈Variable〉→ 〈IDENTIFIER〉
| 〈IDENTIFIER〉〈LBK〉〈Expr〉〈RBK〉
〈StringConstant〉 → 〈STRING〉
〈Constant〉 → 〈INTCON〉
| 〈FLOATCON〉
4.2 Section Notes
4.2.1 Declarations
Cminus has two standard types:intandfloat. Integers and floats occupy a single machine “word”. This
standard types may be composed into a structured array type. An identifier may represent one of four types
of objects:
1. an integer variable
2. an integer array
3. a floating point variable
4. a floating point array
Identifiers are declared to be variables or arrays in a variable declaration. Only singly dimensioned arrays
are permitted in Cminus. Indexing begins at 0 as in C and Java.
Example:
int x,y;
int a[15];
float vector[100];
4
4.2.2 Function Declarations
The semantics of function definition are simple. A function returns the value of the expression specified in
the firstreturnstatement that it executes.
Example:
int max (){
int a,b;
read(a); read(b);
if (a < b){
return b;
}
else{
return a;
}
}
4.2.3 Assignment Statement
The assignment statement requires that theleft hand side(the〈Variable〉non-terminal) andright hand side
(the〈Expr〉non-terminal) evaluate to have the same type. If they have different types, a context-sensitive
error has occurred.
4.2.4 If Statement
The grammar for theif-elseconstruct is written to eliminate the dangling else ambiguity. This is done
by forcing a〈CompoundStatement〉in each of the then- and else-clauses. To evaluate an if statement, the
expression is evaluated. For an integer value, Cminus defines0asfalse; any other value is equivalent totrue.
Examples:
if (c == d){d = a;}
if (b == 0){b = 2*a;}else{b = b/2;}
4.2.5 While Statement
The while statement provides a simple mechanism for iteration. Cminus’s while statement behaves like the
while statement in many other languages; it executes the statement in the loop’s body until the controlling
expression becomes false. The controlling expression will be treated as a boolean value encoded into anint
expression.
4.2.6 Input-Output Statements
Cminus provides two primitives for input and output. Thereadandwritestatements are intended to provide
direct access to primitives implemented in the target abstract machine. Note that thewritestatement
outputs a newline by default.
Examples:
read (x);
write (x+y);
write (‘‘error’’);
5
4.2.7 Return Statements
Cminus allows functions to return values. The type of the return value needs to be the same type as the
type of the function, or it must be converted to that type.
Example:
int f(){
return 7;
}
4.2.8 Exit Statement
An exit statement in Cminus exits the program completely.
4.2.9 Expressions
Cminus expressions compute simple values of typeintorfloat. For both integer and floating point numbers,
arithmetic and comparison are defined.
Coercion:If an expression contains operands of only one type, evaluation is straight forward. When an
operand contains mixed types, the situation is more complex. If anAddoporMulophas anintoperand and
afloatoperand, theintoperand should be converted to afloatbefore the operation is performed.
Relational operators always produce an integer. Comparisons between integers and floats produce integer
results. To perform the comparison, the integer is converted to a float. For the numbers, comparison is based
on both sign and magnitude.
Note: in an assignment, the value of a numeric expression gets converted to match the type of the variable
that appears on its left hand side.
Operator PrecedenceOperator precedences in Cminus are specified in the table below. Multiplication
and division have the highest priority,&&and||have the lowest.
Operator
Precedence
*, /5
+, -4
<, <=, =, >=, >, !=3
!
2
&&, ||
1
Relational operators always produce an integer.
Booleans:Because Cminus has no booleans, relational expressions are defined to yield integer results.
Thus, a relational expression of the forma == bis considered to be an arithmetic expression whose value
is0if the relation does not hold and not0otherwise. Hence, both theif-elseandwhilestatements test
integer values; the expression is consideredfalseif it evaluates to0and totrueif it evaluates to anything
else. Consider the following example which tests for either of two conditions being true:
{
read (a); read (b); read (c); read (d);
if ((a == b) + (c } 6 Note that relational expressions should be enclosed in parentheses because they have very low precedence. In the above example, the special operator||could have been used. In Cminus the operator||takes two integer operands.||produces the result0if both operands evaluate to0; otherwise, it produces1. The operator&&evaluates to1if both operands are nonzero; otherwise it evaluates to0. The unary logical operator!evaluates to1if its argument is zero and to0otherwise. Function Invocation:Cminus uses parentheses to indicate invocation and square brackets to indicate subscripting of an array. This simplifies the grammar — many languages use parentheses for both purposes. Cminus has no parameters. Global variables are required to pass values to a function. 5 An Example Program The following program represents a simple example program written in Cminus. This program successively reads pairs of integers from the input file and prints out their product. int main(){ int x, y; read(x); read(y); while ((x! =0) || (y! =0)){ write (x*y); read (x); read (y); } exit; } 7