Revision as of 15:15, 3 April 2010

What is a Grammar?

An unrestricted grammar is a quadruple <amsmath>G=(V,\Sigma,R,S)</amsmath>, where <amsmath>V</amsmath> is an alphabet; <amsmath>\Sigma</amsmath> is the set of terminal symbols (<amsmath>\Sigma\subseteq{}V</amsmath>); <amsmath>(V-\Sigma)</amsmath> is the set of non-terminal symbols; <amsmath>S</amsmath> is the initial symbol; and <amsmath>R</amsmath> is a set of rules (a finite subset of <amsmath>(V^*(V-\Sigma)V^*)\times{}V^*</amsmath>).
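
As a rough illustration of the definition, the sketch below checks that a candidate quadruple has the required shape. The function name `is_valid_grammar`, the representation of symbols as single characters, and the requirement that the initial symbol be a non-terminal are assumptions made for this example, not part of the definition above.

```python
def is_valid_grammar(V, sigma, R, S):
    """Check the shape constraints on G = (V, Σ, R, S).

    V and sigma are sets of one-character symbols; R is a list of
    (lhs, rhs) string pairs. All names here are illustrative.
    """
    nonterminals = V - sigma
    if not sigma <= V:
        return False
    if S not in nonterminals:  # assumption: the initial symbol is a non-terminal
        return False
    for lhs, rhs in R:
        symbols_ok = all(c in V for c in lhs + rhs)
        # Each left-hand side must lie in V*(V-Σ)V*, i.e. contain a non-terminal.
        has_nonterminal = any(c in nonterminals for c in lhs)
        if not (symbols_ok and has_nonterminal):
            return False
    return True


# Hypothetical grammar for { a^n b^n }: S -> aSb | ε
V = {"S", "a", "b"}
sigma = {"a", "b"}
print(is_valid_grammar(V, sigma, [("S", "aSb"), ("S", "")], "S"))
print(is_valid_grammar(V, sigma, [("a", "b")], "S"))
```

The second call fails the check because the left-hand side of its only rule contains no non-terminal symbol.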

The following are defined:

Direct derivation: <amsmath>u\underset{\text{\tiny G}}{\Rightarrow}v\;\text{iff}\;\exists_{w_1,w_2\in{}V^*}: \exists_{(u',v')\in{}R}: u=w_1u'w_2 \wedge v=w_1v'w_2</amsmath>
Derivation: <amsmath>w_0\underset{\text{\tiny G}}{\Rightarrow}w_1\underset{\text{\tiny G}}{\Rightarrow}\cdots\underset{\text{\tiny G}}{\Rightarrow}w_n\;\Leftrightarrow{}w_0\overset{*}{\underset{\text{\tiny G}}{\Rightarrow}}w_n</amsmath>
Generated language: <amsmath>L(G) = \{ w\: |\: w \in \Sigma^* \wedge{}S\overset{*}{\underset{\text{\tiny G}}{\Rightarrow}}w \}</amsmath>
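
The direct-derivation relation can be sketched in code as follows. This is a minimal illustration, assuming each symbol is a single character and rules are given as (left-hand side, right-hand side) string pairs; the function name `direct_derivations` and the sample grammar are hypothetical.

```python
def direct_derivations(u, rules):
    """Return every string v such that u => v in a single step.

    `rules` is a list of (lhs, rhs) string pairs, one character per symbol.
    """
    results = set()
    for lhs, rhs in rules:
        start = u.find(lhs)
        while start != -1:
            # u = w1 + lhs + w2  yields  v = w1 + rhs + w2
            results.add(u[:start] + rhs + u[start + len(lhs):])
            start = u.find(lhs, start + 1)
    return results


# Hypothetical grammar: S -> aSb | ε
rules = [("S", "aSb"), ("S", "")]
print(sorted(direct_derivations("aSb", rules)))  # ['aaSbb', 'ab']
```

Applying the function repeatedly, starting from the initial symbol, enumerates derivations; the strings that contain only terminal symbols belong to the generated language.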

The FIRST and FOLLOW sets

Computing the FIRST Set

The FIRST set for a given string or symbol can be computed as follows:

If a is a terminal symbol, then FIRST(a) = {a}
If X is a non-terminal symbol and X -> ε is a production, then add ε to FIRST(X)
If X is a non-terminal symbol and X -> Y₁...Y_n is a production, then
a ∈ FIRST(X) if a ∈ FIRST(Y_i) for some i and ε ∈ FIRST(Y_j) for every j < i (i.e., Y_j <amsmath>\overset{*}{\Rightarrow}</amsmath> ε); if ε ∈ FIRST(Y_j) for every j, then ε ∈ FIRST(X)

As an example, consider the production X -> Y₁...Y_n, assuming it is the only production for X:

If Y₁ <amsmath>\overset{*}{\nRightarrow}</amsmath> ε then FIRST(X) = FIRST(Y₁)
If Y₁ <amsmath>\overset{*}{\Rightarrow}</amsmath> ε and Y₂ <amsmath>\overset{*}{\nRightarrow}</amsmath> ε then FIRST(X) = (FIRST(Y₁) \ {ε}) ∪ FIRST(Y₂)
If Y_i <amsmath>\overset{*}{\Rightarrow}</amsmath> ε (∀i) then FIRST(X) = ∪_i(FIRST(Y_i)\{ε}) ∪ {ε}

The FIRST set can also be computed for a string Y₁...Y_n, in much the same way as in the third rule above.
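
The rules above can be sketched as a fixed-point computation. This is one possible implementation, not a definitive one: the dictionary representation of the grammar, the names `compute_first` and `first_of_string`, and the sample expression grammar are all assumptions made for illustration.

```python
EPS = "ε"


def first_of_string(symbols, first):
    """FIRST of a sequence of symbols, given the FIRST sets known so far."""
    result = set()
    for sym in symbols:
        result |= first[sym] - {EPS}
        if EPS not in first[sym]:
            return result
    result.add(EPS)  # every symbol can vanish (this also covers the empty string)
    return result


def compute_first(grammar, terminals):
    """grammar maps each non-terminal to a list of productions (tuples of symbols)."""
    first = {t: {t} for t in terminals}          # FIRST(a) = {a} for terminals
    first.update({nt: set() for nt in grammar})
    changed = True
    while changed:                               # repeat until nothing changes
        changed = False
        for nt, productions in grammar.items():
            for prod in productions:
                new = first_of_string(prod, first)
                if not new <= first[nt]:
                    first[nt] |= new
                    changed = True
    return first


# Hypothetical grammar: E -> T E' ; E' -> + T E' | ε ; T -> id | ( E )
grammar = {
    "E": [("T", "E'")],
    "E'": [("+", "T", "E'"), ()],
    "T": [("id",), ("(", "E", ")")],
}
first = compute_first(grammar, {"+", "id", "(", ")"})
print(first["E"], first["E'"], first["T"])
```

For this grammar, FIRST(E) and FIRST(T) both come out as {id, (}, and FIRST(E') as {+, ε}. The helper `first_of_string` handles the string case mentioned above.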

Computing the FOLLOW Set

The FOLLOW set is computed for non-terminals and contains the terminal symbols that can appear immediately after a given non-terminal in some derivation. The special symbol $ is used to represent the end of phrase (end of input).

If X is the grammar's initial symbol then {$} ⊆ FOLLOW(X)
If A -> αXβ is a production, then FIRST(β)\{ε} ⊆ FOLLOW(X)
If A -> αX or A -> αXβ (β <amsmath>\overset{*}{\Rightarrow}</amsmath> ε), then FOLLOW(A) ⊆ FOLLOW(X)

These rules should be applied repeatedly, to every production, until no FOLLOW set changes.
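
The three FOLLOW rules can be sketched as another fixed-point loop. This is a minimal illustration under assumed names (`compute_follow`, the dictionary grammar representation, the sample grammar); a compact FIRST helper is included only so that the block runs on its own.

```python
EPS, END = "ε", "$"


def first_of_string(symbols, first):
    result = set()
    for sym in symbols:
        result |= first[sym] - {EPS}
        if EPS not in first[sym]:
            return result
    return result | {EPS}


def compute_first(grammar, terminals):
    first = {t: {t} for t in terminals}
    first.update({nt: set() for nt in grammar})
    changed = True
    while changed:
        changed = False
        for nt, prods in grammar.items():
            for prod in prods:
                new = first_of_string(prod, first)
                if not new <= first[nt]:
                    first[nt] |= new
                    changed = True
    return first


def compute_follow(grammar, terminals, start):
    first = compute_first(grammar, terminals)
    follow = {nt: set() for nt in grammar}
    follow[start].add(END)                       # rule 1: $ follows the initial symbol
    changed = True
    while changed:                               # repeat until no FOLLOW set changes
        changed = False
        for a, prods in grammar.items():
            for prod in prods:
                for i, x in enumerate(prod):
                    if x not in grammar:
                        continue                 # FOLLOW is only defined for non-terminals
                    beta = first_of_string(prod[i + 1:], first)
                    new = beta - {EPS}           # rule 2: FIRST(β) \ {ε}
                    if EPS in beta:              # rule 3: β is empty or derives ε
                        new |= follow[a]
                    if not new <= follow[x]:
                        follow[x] |= new
                        changed = True
    return follow


# Hypothetical grammar: E -> T E' ; E' -> + T E' | ε ; T -> id | ( E )
grammar = {
    "E": [("T", "E'")],
    "E'": [("+", "T", "E'"), ()],
    "T": [("id",), ("(", "E", ")")],
}
follow = compute_follow(grammar, {"+", "id", "(", ")"}, "E")
print(follow)
```

For this grammar, FOLLOW(E) and FOLLOW(E') come out as {$, )}, and FOLLOW(T) as {+, $, )}.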

Exercises

Exercise 1 - simple ambiguous grammar.

@@ Line 3: / Line 3: @@
 = What is a Grammar? =
-Uma gramática é um quádruplo <amsmath>G=(V,\Sigma,R,S)</amsmath>, onde <amsmath>V</amsmath> é um alfabeto; <amsmath>\Sigma</amsmath> é o conjunto de símbolos terminais (<amsmath>\Sigma\subseteq{}V</amsmath>); <amsmath>(V-\Sigma)</amsmath> é o conjunto de símbolos não terminais; <amsmath>S</amsmath> é o símbolo inicial; e <amsmath>R</amsmath> é o conjunto de regras (um subconjunto finito de <amsmath>(V^*(V-\Sigma)V^*)\times{}V^*</amsmath>).  As noções de derivação directa), de derivação, e de linguagem gerada são definidas como se segue:
+An unrestricted grammar is a quadruple <amsmath>G=(V,\Sigma,R,S)</amsmath>, where <amsmath>V</amsmath> is an alphabet; <amsmath>\Sigma</amsmath> is the set of terminal symbols (<amsmath>\Sigma\subseteq{}V</amsmath>); <amsmath>(V-\Sigma)</amsmath> is the set of non-terminal symbols; <amsmath>S</amsmath> is the initial symbol; and <amsmath>R</amsmath> is a set of rules (a finite subset of <amsmath>(V^*(V-\Sigma)V^*)\times{}V^*</amsmath>).
-* <amsmath>u\underset{\text{\tiny G}}{\Rightarrow}v\;\text{sse}\;\exists_{w_1,w_2\in{}V^*}: \exists_{(u',v')\in{}R}: u=w_1u'w_2 \wedge v=w_1v'w_2</amsmath>
+The following are defined:
-* <amsmath>w_0\underset{\text{\tiny G}}{\Rightarrow}w_1\underset{\text{\tiny G}}{\Rightarrow}\cdots\underset{\text{\tiny G}}{\Rightarrow}w_n\;\Leftrightarrow{}w_0\overset{*}{\underset{\text{\tiny G}}{\Rightarrow}}w_n</amsmath>
-* <amsmath>L(G) = \{ w\: |\: w \in \Sigma^*  \wedge{}S\overset{*}{\underset{\text{\tiny G}}{\Rightarrow}}w \}</amsmath>
+* Direct derivation: <amsmath>u\underset{\text{\tiny G}}{\Rightarrow}v\;\text{iff}\;\exists_{w_1,w_2\in{}V^*}: \exists_{(u',v')\in{}R}: u=w_1u'w_2 \wedge v=w_1v'w_2</amsmath>
+* Derivation: <amsmath>w_0\underset{\text{\tiny G}}{\Rightarrow}w_1\underset{\text{\tiny G}}{\Rightarrow}\cdots\underset{\text{\tiny G}}{\Rightarrow}w_n\;\Leftrightarrow{}w_0\overset{*}{\underset{\text{\tiny G}}{\Rightarrow}}w_n</amsmath>
+* Generated language: <amsmath>L(G) = \{ w\: |\: w \in \Sigma^*  \wedge{}S\overset{*}{\underset{\text{\tiny G}}{\Rightarrow}}w \}</amsmath>
 = The FIRST and FOLLOW sets =

Introduction to Syntax: Difference between revisions