This example implements a simple calculator. The calculator has an unspecified number of integer variables and the common binary integer operators (namely, addition, subtraction, multiplication, division, and modulus), and unary integer operators (+ and -).
The language will contain the following concepts (tokens): VAR (a variable: the corresponding s attribute will contain its name); INT (an integer: the corresponding i attribute contains its value); and the operators (see below).
The lexical analyzer (vars.l) is very simple and limited to recognizing variable names (token VAR), integers numbers (token INT), and the operators themselves. <text> %option noyywrap %{
%} %% [_[:alpha:]][_[:alnum:]]* yylval.s = new std::string(yytext); return VAR; digit:+ yylval.i = strtol(yytext, NULL, 10); return INT; [-+*/%=^:,] return *yytext; .|\n ; /* ignore all the rest */ %% </text>
The syntactic analyzer will be built to immediately compute the expressions in a syntax-directed fashion as they occur. It is, thus, important to use trees that build nodes as the expressions occur (left-recursive grammars). If the grammar were right-recursive, the last node would be the first to be built and the (syntax-directed) evaluation would be from the last expression to the first. <text> %{
static std::map<std::string, int> vars; %}
%union { int i; std::string *s; }
%token BATATA
%token INT
%token VAR
%type<i> expr
%right '=' %left '+' '-' %left '*' '/' '%' %right BATATA
%%
list: stmt
| list stmt ;
stmt: expr ','
| expr ':' { std::cout << $1 << std::endl; } ;
expr: INT { $$ = $1; }
| VAR { $$ = vars[*$1]; delete $1; } | VAR '=' expr { $$ = vars[*$1] = $3; delete $1; } | expr '+' expr { $$ = $1 + $3; } | expr '-' expr { $$ = $1 - $3; } | expr '*' expr { $$ = $1 * $3; } | expr '/' expr { $$ = $1 / $3; } | expr '%' expr { $$ = $1 % $3; } | '+' expr %prec BATATA { $$ = $2; } | '-' expr %prec BATATA { $$ = -$2; } | '(' expr ')' { $$ = $2; } ;
%% extern int yylex(); extern int yyparse(); void yyerror(char *s) { std::cout << s << std::endl; } int main() { yyparse(); } </text>
The Flex specification is processed as follows (the file lex.yy.c is produced):
flex vars.l
The YACC specification is processed as follows (files y.tab.h, needed by the Flex-generated code, and y.tab.c):
yacc -dtv vars.y
Compiling the C/C++ (it is C++ simply because we programmed the extra code in that language):
g++ -c lex.yy.c g++ -c y.tab.c g++ -o vars y.tab.o lex.yy.o