Note: this exercise is taken from the
Compilers subject at Barcelona School of Informatics.
Design a grammar to describe hierarchic graphs. The set of tokens is
{
graph,
endgraph,
IDENT,
(,
),
,,
;,
>}. The token
IDENT refers to identifiers, such as
graph names, i.e., non-empty sequences of alphanumeric characters and
underscore, not starting by a digit.
Example input:
graph F(in, out)
in > a > out;
in > b > a;
b > out;
endgraph
graph G(a, b, c, d)
a > x > y > x;
b > x; y > c; y > d;
endgraph
graph top(x1, x2, y)
x1 > n1; x1 > n2;
G(n1, n2, n8, n5);
G(n3, n4, n6, n7);
x2 > n3; x2 > n4;
F(n7, n11);
G(n5, n6, n9, n10);
G(n8, n9, n12, n13);
G(n10, n11, n14, n15);
n13 > y; n14 > y;
n15 > n12;
endgraph
Remarks:
- The grammar must have a special node named global as root.
- Each input contains at least one graph.
- The set of parameters of a graph can be empty, as in “graph H() ...”.
- The body of a graph can be empty, as in “graph H(a,b) endgraph”.
- The AST corresponding to a graph must have the token graph as
root, the graph name as first child, a special node named params as
second child, with one child per parameters, and a special node named
body as third child, with one child per edge definition.
- An edge definition of the form x > y > z must be represented by a
subtree with the token > as root, and one child per node. An edge
definition consisting of an embedded subgraph must be represented by a subtree
with the token ( as root, the graph name as first child and a special
node named params as second child, with one child per parameter.