(→The Syntactic Analyzer (YACC) Specification) |
|||
Line 24: | Line 24: | ||
The syntactic analyzer will be built to immediately compute the expressions in a syntax-directed fashion as they occur. It is, thus, important to use trees that build nodes as the expressions occur (left-recursive grammars). If the grammar were right-recursive, the last node would be the first to be built and the (syntax-directed) evaluation would be from the last expression to the first. | The syntactic analyzer will be built to immediately compute the expressions in a syntax-directed fashion as they occur. It is, thus, important to use trees that build nodes as the expressions occur (left-recursive grammars). If the grammar were right-recursive, the last node would be the first to be built and the (syntax-directed) evaluation would be from the last expression to the first. | ||
+ | <text> | ||
+ | %{ | ||
+ | #include <iostream> | ||
+ | #include <string> | ||
+ | #include <map> | ||
+ | static std::map<std::string, int> vars; | ||
+ | %} | ||
− | + | %union { int i; std::string *s; } | |
− | < | + | |
+ | %token BATATA | ||
+ | %token<i> INT | ||
+ | %token<s> VAR | ||
+ | %type<i> expr | ||
+ | |||
+ | %right '=' | ||
+ | %left '+' '-' | ||
+ | %left '*' '/' '%' | ||
+ | %right BATATA | ||
+ | |||
+ | %% | ||
+ | |||
+ | list: stmt | ||
+ | | list stmt | ||
+ | ; | ||
+ | |||
+ | stmt: expr ',' | ||
+ | | expr ':' { std::cout << $1 << std::endl; } | ||
+ | ; | ||
+ | |||
+ | expr: INT { $$ = $1; } | ||
+ | | VAR { $$ = vars[*$1]; delete $1; } | ||
+ | | VAR '=' expr { $$ = vars[*$1] = $3; delete $1; } | ||
+ | | expr '+' expr { $$ = $1 + $3; } | ||
+ | | expr '-' expr { $$ = $1 - $3; } | ||
+ | | expr '*' expr { $$ = $1 * $3; } | ||
+ | | expr '/' expr { $$ = $1 / $3; } | ||
+ | | expr '%' expr { $$ = $1 % $3; } | ||
+ | | '+' expr %prec BATATA { $$ = $2; } | ||
+ | | '-' expr %prec BATATA { $$ = -$2; } | ||
+ | | '(' expr ')' { $$ = $2; } | ||
+ | ; | ||
+ | |||
+ | %% | ||
+ | extern int yylex(); | ||
+ | extern int yyparse(); | ||
+ | void yyerror(char *s) { std::cout << s << std::endl; } | ||
+ | int main() { yyparse(); } | ||
</text> | </text> | ||
This example implements a simple calculator. The calculator has an unspecified number of integer variables and the common binary integer operators (namely, addition, subtraction, multiplication, division, and modulus), and unary integer operators (+ and -).
The language will contain the following concepts (tokens): VAR (a variable: the corresponding s attribute will contain its name); INT (an integer: the corresponding i attribute contains its value); and the operators (see below).
The lexical analyzer (vars.l) is very simple and limited to recognizing variable names (token VAR), integers numbers (token INT), and the operators themselves. <text> %option noyywrap %{
%} %% [_[:alpha:]][_[:alnum:]]* yylval.s = new std::string(yytext); return VAR; digit:+ yylval.i = strtol(yytext, NULL, 10); return INT; [-+*/%=^:,] return *yytext; .|\n ; /* ignore all the rest */ %% </text>
The syntactic analyzer will be built to immediately compute the expressions in a syntax-directed fashion as they occur. It is, thus, important to use trees that build nodes as the expressions occur (left-recursive grammars). If the grammar were right-recursive, the last node would be the first to be built and the (syntax-directed) evaluation would be from the last expression to the first. <text> %{
static std::map<std::string, int> vars; %}
%union { int i; std::string *s; }
%token BATATA
%token INT
%token VAR
%type<i> expr
%right '=' %left '+' '-' %left '*' '/' '%' %right BATATA
%%
list: stmt
| list stmt ;
stmt: expr ','
| expr ':' { std::cout << $1 << std::endl; } ;
expr: INT { $$ = $1; }
| VAR { $$ = vars[*$1]; delete $1; } | VAR '=' expr { $$ = vars[*$1] = $3; delete $1; } | expr '+' expr { $$ = $1 + $3; } | expr '-' expr { $$ = $1 - $3; } | expr '*' expr { $$ = $1 * $3; } | expr '/' expr { $$ = $1 / $3; } | expr '%' expr { $$ = $1 % $3; } | '+' expr %prec BATATA { $$ = $2; } | '-' expr %prec BATATA { $$ = -$2; } | '(' expr ')' { $$ = $2; } ;
%% extern int yylex(); extern int yyparse(); void yyerror(char *s) { std::cout << s << std::endl; } int main() { yyparse(); } </text>
The Flex specification is processed as follows (the file lex.yy.c is produced):
flex vars.l
The YACC specification is processed as follows (files y.tab.h, needed by the Flex-generated code, and y.tab.c):
yacc -dtv vars.y
Compiling the C/C++ (it is C++ simply because we programmed the extra code in that language):
g++ -c lex.yy.c g++ -c y.tab.c g++ -o vars y.tab.o lex.yy.o