The set of tokens of the language is
{
+,
-,
*,
/,
<,
>,
=,
not,
and,
or,
(,
),
NUMBER}.
The token
NUMBER represents unsigned integers, i.e., non-empty sequences
of digits. Examples of correct expressions are
“
-0+1<2*3”,
“
1>2/3 and 4<5*6”,
“
not -1=+15”,
“
1=2 and 0=0 and 1=1 or 3=4”,
“
not (3>4 or not 5<6)”,
whereas
“
1+”,
“
/2”, “
1 2”,
“
1<2=3”,
are not. The generated AST must correspond to an interpretation of the
following:
- Unary operators +,-, and not have the highest
precedence.
- Arithmetic operators are left-associative, with the usual precedence
rules.
- Comparison operators are not associative (e.g., “1=2=3” is not
valid) and have less precedence than arithmetic operators.
- Logical operators have the lowest precedence, with and being more
sticky than or. The resulting AST must have at most one token or
(resp. and) for each disjunction (resp. conjunction) of boolean
expressions.
Note that expressions such as “
not 1”, “
2 < (not 3=4)” and
“
5 + (6<7)” may be semantically nonsensical, but are syntactically
correct and should be recognized. As an example, for input
not 1-2 or 3<4 and 5>6 and 7=8+9
the resulting AST must be
or(-(not(1),2),and(<(3,4),>(5,6),=(7,+(8,9))))
i.e., as if the implicit parenthesization was
((not 1)-2) or ((3<4) and (5>6) and (7=(8+9)))