Difference between revisions of "Theoretical Aspects of Lexical Analysis"

Revision as of 03:33, 14 March 2008

Regular expressions are defined considering a finite alphabet Σ = { a, b, ..., c } and the empty string ε:

The languages (sets of strings) for each of these entities are:

The following primitive constructors are defined:

Extensions (derived from the above):

Transitive closure (+) - a+ ("one or more 'a'")
Optionality (?) - a? ("zero or one 'a'")
Character classes - [a-z] ("all chars in the 'a-z' range" - only one character is matched)
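
The extensions above add no expressive power: each one can be rewritten using only primitive constructors. The following is a minimal sketch of that idea, assuming the primitives are concatenation, alternation (`|`) and Kleene closure (`*`), since the list of primitives is not shown here. It uses Python's `re` module only as a convenient checker; the names `EQUIVALENTS`, `SAMPLES` and `same_language` are hypothetical and introduced just for this illustration.

```python
import re
import string

# Each extension paired with an equivalent expression built only from the
# assumed primitives: concatenation, alternation "|", Kleene closure "*",
# and the empty string (written here as an empty alternative).
EQUIVALENTS = {
    "a+": "aa*",      # one or more 'a' = one 'a' followed by zero or more 'a'
    "a?": "(a|)",     # zero or one 'a' = 'a' or the empty string
    "[a-z]": "(" + "|".join(string.ascii_lowercase) + ")",  # a|b|...|z
}

# A small, hand-picked set of test strings (not an exhaustive proof).
SAMPLES = ["", "a", "aa", "aaa", "b", "z", "ab", "A"]


def same_language(extended, primitive, samples=SAMPLES):
    """Return True if both patterns accept exactly the same sample strings."""
    return all(
        bool(re.fullmatch(extended, s)) == bool(re.fullmatch(primitive, s))
        for s in samples
    )


if __name__ == "__main__":
    for extended, primitive in EQUIVALENTS.items():
        print(extended, "agrees with its primitive form:",
              same_language(extended, primitive))
```

Agreement on a finite sample does not prove equivalence; it only illustrates it. Note that `[a-z]` rejects `"ab"`, which matches the remark that a character class matches exactly one character.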

@@ Line 22: / Line 22: @@
 * Character classes - [a-z] ("all chars in the 'a-z' range" - only one character is matched)
-== Recognizing Regular Expressions ==
+== Recognizing/Matching Regular Expressions ==
 == Building the NFA: Thompson's Algorithm ==