(→Regular Expressions) |
|||
Line 6: | Line 6: | ||
== Regular Expressions == | == Regular Expressions == | ||
− | Regular expressions are defined considering | + | Regular expressions are defined considering a finite alphabet '''Σ = { a, b, ..., c }''' and the empty string '''ε''': |
− | |||
− | |||
− | + | The languages (sets of strings) for each of these entities are: | |
+ | * '''{ε}''', for '''ε''' | ||
+ | * '''{a}''', for an entry '''a''' in '''Σ''' | ||
+ | |||
+ | The following primitive constructors are defined: | ||
* concatenation | * concatenation | ||
* alternative | * alternative | ||
* Kleene-star (*) | * Kleene-star (*) | ||
− | Extensions: | + | Extensions (derived from the above): |
* Transitive closure (+) - a+ ("one or more 'a'") | * Transitive closure (+) - a+ ("one or more 'a'") | ||
* Optionality (?) - a? ("zero or one 'a'") | * Optionality (?) - a? ("zero or one 'a'") |
Lexical analysis, the first step in the compilation process, splits the input data into segments and classifies them. Each segment of the input (a lexeme) will be assigned a label (the token).
In this case, we will be using regular expressions for recognizing portions of the input text.
Regular expressions are defined considering a finite alphabet Σ = { a, b, ..., c } and the empty string ε:
The languages (sets of strings) for each of these entities are:
The following primitive constructors are defined:
Extensions (derived from the above):