Note: this exercise is taken from the
Compilers subject at Barcelona School of Informatics (and is based on a similar one from
Compilers: Principles, Techniques, and Tools).
Design a grammar to describe identifiers with subscripts. The set of tokens is
{
IDENT,
NUMBER,
[,
]}. For instance, all the
following are valid inputs:
- f
- a[i]
- Matriu[arg[i]]
- a[i][j[min]]
- Vector[i[min]]
- Vector[i[13][min]]
- C[1][2][arg[3]]
- M[fab[3]][2]
Remarks:
- The token IDENT recognizes identifiers (i.e., non-empty sequences
of alphabetic characters), and the token NUMBER recognizes unsigned
numbers (i.e., non-empty sequences of digits).
- Subscripts can be identifiers or unsigned numbers. Subscripts that are
identifiers can have subscripts of their own, whereas those that are numbers
cannot. Therefore, Vector[1[i]] is not valid.
- The AST must have the main identifier as root, with one child per
subscript. An identifier subscript must have an analogous subtree.