Line 66: | Line 66: | ||
* [[Introduction to Syntax/Exercise 1|Exercise 1]] - simple ambiguous grammar. | * [[Introduction to Syntax/Exercise 1|Exercise 1]] - simple ambiguous grammar. | ||
− | [[category: | + | [[category:Compiladores]] |
− | [[category: | + | [[category:Ensino]] |
[Expand] Compiladores |
---|
An unrestricted grammar is a quadruple , where is an alphabet; is the set of terminal symbols (); is the set of non-terminal symbols; is the initial symbol; and is a set of rules (a finite subset of ).
The following are defined:
The above grammar is unrestricted in its derivation capabilities and directly supports context-dependent rules.
In our study of programming language processing, though, only context-free grammars will be considered. This means that the set is defined on , i.e., only non-terminals are allowed on the left side of a rule.
Even though we gave a formal definition of grammar, seldom will we use them described in that form. Most grammars are described in Bakus-Naur Form (BNF), Extended Backus–Naur Form (EBNF), or other metasyntax.
Consider the following example grammar:
E -> E + E | T T -> T * T | F F -> ( E ) | val
This grammar provides a description of the syntax of simple additive and multiplicative expressions. The language will be any expression involving the two operators, sets of parentheses and values (val). Tokens (terminals) are in blue. Token val represents any number and is a terminal in this grammar: in a compiler implementation this would be recognized at the lexical level.
The FIRST set for a given string or symbol can be computed as follows:
As an example, consider production X -> Y1...Yn
The FIRST set can also be computed for a string Y1...Yn much in the same way as in case 3 above.
The FOLLOW set is computed for non-terminals and indicates the set of terminal symbols that are possible after a given non-terminal. The special symbol $ is used to represent the end of phrase (end of input).
The algorithm should be repeated until the FOLLOW set remains unchanged.
The LOOKAHEAD set is defined for a rule as follows:
LOOKAHEAD(R -> Body) = FIRST(Body) ∪ FOLLOW(R), when Body ε