We have the following grammar for a simple language:
stmts^: stmt+;
stmt: 'if'^ expr 'then'! stmts ('else'! stmts)? 'end'!
| 'while'^ expr 'do'! stmts 'end'!
| 'for'^ expr? ';'! expr? ';'! expr? 'do'! stmts 'done'!
| 'break'
| expr ';'!;
expr: addi (('='^|'+='^|'-='^|'*='^|'/='^) expr)?;
addi: term (('+'^|'-'^) term)*;
term: atom (('*'^|'/'^) atom)*;
atom: '-'^ atom | IDENT | NATURAL_LIT | '('! expr ')'!;
IDENT: ('a'..'z'|'A'..'Z'|'_')('a'..'z'|'A'..'Z'|'_'|'0'..'9')*;
NATURAL_LIT: ('0'..'9')+;
However, we have discovered the following problem: the expressions
- for i+=1 ; ; do ... done
- for ; ; i+=1 do ... done
generate the same AST: “
for(+=(i,1), ...)”. This happens because the
expr on the header of the
for-loop may be missing, and in such
case, nothing is included in the AST. Fix the grammar to resolve these
ambiguities by making missing expressions on the header of the
for-loop
to appear explicitly in the AST: they should be represented by a special node
named
nop (
no operation). For instance, the AST of the previous
examples should be:
- for(+=(i,1),nop,nop,...)
- for(nop,nop,+=(i,1),...)