Consider the following grammar with the following set of terminal symbols {+,-,*,/,(,),id}:
E -> E + T | E - T | T T -> T * F | T / F | F F -> ( E ) | id
The grammar can be parsed by an LL(1) parser if it does not have left recursion and no ambiguity is present (i.e., the LOOKAHEADs for all productions of each non-terminal are disjoint).
A simple inspection of the grammar shows that left recursion is present in both E and T rules. Also, there are left corners that may hide ambiguity.
The first step, then, is to rewrite the grammar so that left recursion is eliminated:
E -> T E' E' -> + T E' | - T E' | ε T -> F T' T' -> * F T' | / F T' | ε F -> ( E ) | id
The new grammar still has left corners, but a cursory inspection shows that they are not problematic. Nevertheless, they should be eliminated:
E -> ( E ) T' E' | id T' E' E' -> + T E' | - T E' | ε T -> ( E ) T' | id T' T' -> * F T' | / F T' | ε F -> ( E ) | id
FIRST(E) = FIRST(( E ) T' E') ∪ FIRST(id T' E') = {(, id} FIRST(E') = FIRST(+ T E') ∪ FIRST(- T E') ∪ {ε} = {+, -, ε} FIRST(T) = FIRST(( E ) T') ∪ FIRST(id T') = {(, id} FIRST(T') = FIRST(* F T') ∪ FIRST(/ F T') ∪ {ε} = {*, /, ε} FIRST(F) = FIRST(( E )) ∪ FIRST(id) = {(, id}
FOLLOW(E) = {$} ∪ {)} = {), $} FOLLOW(E') = FOLLOW(E) = {), $} FOLLOW(T) = FIRST(E')\{ε} ∪ FOLLOW(E') = {+, -, ), $} FOLLOW(T') = FOLLOW(T) = {+, -, ), $} FOLLOW(F) = FIRST(T')\{ε} ∪ FOLLOW(T') = {*, /, +, -, ), $}
For building the parse table, we will consider the FIRST and FOLLOW sets above.
+ | - | * | / | id | ( | ) | $ | |
---|---|---|---|---|---|---|---|---|
E | E -> id T' E' | E -> ( E ) T' E' | ||||||
E' | E' -> + T E' | E' -> - T E' | E' -> ε | E' -> ε | ||||
T | T -> id T' | T -> ( E ) T' | ||||||
T' | T' -> ε | T' -> ε | T' -> * F T' | T' -> / F T' | T' -> ε | T' -> ε | ||
F | F -> id | F -> ( E ) |
The following table show the parsing of the a*(b+c) input sequence.
STACK | INPUT | ACTION |
E$ | a*(b+c)$ | E -> id T' E' |
idT'E'$ | a*(b+c)$ | -> id ≡ a |
T'E'$ |
*(b+c)$ | T' -> * F T' |
*FT'E'$ |
*(b+c)$ | -> * |
FT'E'$ | (b+c)$ | F -> ( E ) |
(E)T'E'$ |
(b+c)$ | -> ( |
E)T'E'$ | b+c)$ | E -> id T' E' |
idT'E')T'E'$ | b+c)$ | -> id ≡ b |
T'E')T'E'$ | +c)$ | T' -> ε |
E')T'E'$ | +c)$ | E' -> + T E' |
+TE')T'E'$ | +c)$ | -> + |
TE')T'E'$ | c)$ | T -> id T' |
idT'E')T'E'$ | c)$ | -> id ≡ c |
T'E')T'E'$ | )$ | T' -> ε |
E')T'E'$ | )$ | E' -> ε |
)T'E'$ | )$ | -> ) |
T'E'$ | $ | T' -> ε |
E'$ | $ | E' -> ε |
$ | $ | ACCEPT |