What is a Grammar?
An unrestricted grammar is a quadruple , where is an alphabet; is the set of terminal symbols (); is the set of non-terminal symbols; is the initial symbol; and is a set of rules (a finite subset of ).
The following are defined:
- Direct derivation:
- Derivation:
- Generated language:
The FIRST and FOLLOW sets
Computing the FIRST Set
The FIRST set for a given string or symbol can be computed as follows:
- If a is a terminal symbol, then FIRST(a) = {a}
- If X is a non-terminal symbol and X -> ε is a production then add ε to FIRST(X)
- If X is a non-terminal symbol and X -> Y1...Yn is a production, then
- a ∈ FIRST(X) if a ∈ FIRST(Yi) and ε ∈ FIRST(Yj), i>j (i.e., Yj ε)
As an example, consider production X -> Y1...Yn
- If Y1 ε then FIRST(X) = FIRST(Y1)
- If Y1 ε and Y2 ε then FIRST(X) = FIRST(Y1) \ {ε} ∪ FIRST(Y2)
- If Yi ε (∀i) then FIRST(X) = ∪i(FIRST(Yi)\{ε}) ∪ {ε}
The FIRST set can also be computed for a string Y1...Yn much in the same way as in case 3 above.
Computing the FOLLOW Set
The FOLLOW set is computed for non-terminals and indicates the set of terminal symbols that are possible after a given non-terminal. The special symbol $ is used to represent the end of phrase (end of input).
- If X is the grammar's initial symbol then {$} ⊆ FOLLOW(X)
- If A -> αXβ is a production, then FIRST(β)\{ε} ⊆ FOLLOW(X)
- If A -> αX or A -> αXβ (β ε), then FOLLOW(A) ⊆ FOLLOW(X)
The algorithm should be repeated until the FOLLOW set remains unchanged.
Exercises