Difference between revisions of "Introdução ao Desenvolvimento de Compiladores"

From Wiki**3

(New page: {{TOCright}} == Introduction == === Who Should Read This Document? === This document is for those who seek to use the flex and yacc tools beyond the C programming language and apply obje...)
 
 
(13 intermediate revisions by the same user not shown)
Line 1: Line 1:
{{TOCright}}
+
__NOTOC__
== Introduction ==
+
{{NAVCompiladores}}
  
=== Who Should Read This Document? ===
+
Our compiler development approach adopts the classical structure, beginning with lexical analysis and moving through syntactic analysis, semantic analysis, and finally code generation. This process utilizes two code generators -- one for creating the scanner and another for developing the syntactic parser -- and leverages modules manually crafted by the programmer.
  
This document is for those who seek to use the flex and yacc tools beyond the C programming language and apply object-oriented (OO) programming techniques to compiler construction. In the following text, the C++ programming language is used, but the rationale is valid for other OO languages as well. Note, however, that C++ works with C tools almost without change, something that may not be true of other languages (although tools similar to flex and yacc may exist that support them).
+
While we adhere to the traditional versions of the scanner and parser generators (Flex and BYACC), we transition from C to C++. Our usage of C++ extends beyond merely swapping languages and scripting several classes. Instead, we exploit the full potential of object-oriented programming (OOP) and design patterns, leading to a compiler that is organized in a logical, user-friendly manner, and is amenable to modifications. These features are paramount in terms of programming effort and pedagogical objectives.
  
The use of C++ is not motivated only by a "better" C, a claim some would deny. Rather, it is motivated by the advantages that can be gained from bringing OO design principles in contact with compiler construction problems. In this regard, C++ is a more obvious choice than C. One could say that if you have mastered OO design, then you can do it in almost any language, but C++ remains the better choice, simply because it offers direct support for those principles and a strict type system. At the same time, C++ is not so far removed from C that traditional compiler development techniques and tools have to be abandoned.
+
== A Case for C++ ==
  
Going beyond basic OO principles into the world of design patterns is just a small step, but one that contributes much of the overall gains in this change: indeed, effective use of a few choice design patterns -- especially, but not necessarily limited to, the ''composite'' and ''visitor'' design patterns -- contributes to a much more robust compiler and a much easier development process.
+
Using C++ is not only a way of getting a "better C": it also allows OOP architecture principles to be applied natively. These principles could be implemented in C, but at the cost of increased development complexity. Thus, our interest is not confined to recompiling old C code with a C++ compiler and hoping for the best. Instead, C++ is meant to affect every facet of compiler development, from the overall organization of the compiler to the composition of each component.
  
The document assumes basic knowledge of object-oriented design as well as abstract data type definition. Knowledge about design patterns is desirable, but not necessary: the patterns used in the text will be briefly presented. Nevertheless, useful insights can be gained from reading a patterns book, such as the "Gang of Four" book (Gamma et al. 1995).
+
The decision to use C++ is not solely about selecting a programming language; it is also about who, or what, writes the compiler code. For a human programmer, using C++ is merely a matter of competence, but tools that generate part of the compiler's code must be selected carefully to ensure that the code they produce works as expected.
  
===  Regarding C++ ===
+
Several widely-used compiler development support tools, such as the Flex lexical analyzer and the Bison parser generator, natively support C++. Conversely, some tools, like Berkeley YACC (BYACC), only support C. In the former scenario, the generated code and the objects it aids merely need integration into the architecture. In the latter scenario, additional adaptations might be necessary, performed either by the programmer or via specialized wrappers. Despite being C code, BYACC-generated parsers, as we will see, are relatively straightforward to adapt to C++.
  
Using C++ is not only a way of ensuring a "better C", but also a way of being able to use OO architecture principles in a native environment (the same principles could have been applied to C development, at the cost of increased development difficulties). Thus, we are not interested only in taking a C++ compiler and our old C code and "hoping for the best". Rather, using C++ is intended to impact every step of compiler development, from the organization of the compiler as a whole to the makeup of each component.
+
== The Role of Design Patterns ==
  
Using C++ is not only a decision about what language to use to write the code: it is also a matter of who or what writes the compiler code. For a human programmer, using C++ is just a matter of competence, but tools that generate some of the compiler's code must be chosen carefully so that the code they generate works as expected.
+
Transitioning from fundamental OOP principles to the realm of design patterns may seem like a small leap, but it brings substantial advantages. Effective use of select design patterns -- particularly, but not limited to, the Composite and Visitor design patterns -- results in a more robust compiler and a streamlined development process.
Some of the most common compiler development support tools already support C++ natively. This is the case of the GNU Flex lexical analyser or the GNU Bison parser generator. Other tools, such as Berkeley YACC (BYACC) support only C. In the former case, the generated code and the objects it supports have only to be integrated into the architecture; in the latter case, further adaptation may be needed, either by the programmer or through specialized wrappers. BYACC-generated parsers, in particular, as will be seen, although they are C code, are simple to adapt to C++.
 
  
=== Organization ===
+
[[category:Compiladores]]
 
+
[[category:Ensino]]
This text parallels both the structure and the development process of a compiler. Thus, the first part deals with lexical analysis, also known as the morphological analysis of the language being recognized. The second part presents syntax analysis in general and LALR(1) parsers in particular. The fourth part is dedicated to semantic analysis and the deep structure of a program as represented by a linguistic structure. Semantic processing also covers code generation, translation, and interpretation, as well as other processes that follow a similar development approach.
+
[[en:Introduction to Compiler Development]]
 
 
The appendices present the code used throughout the document. In particular, they give detailed descriptions of each hierarchy. They also present the structure of the final compiler in terms of code: both the code written by the compiler developer and the support code for compiler development and final program execution.
 
 
 
== Using C++ and the CDK Library ==
 
 
 
 
 
 
 
== Lexical Analysis ==
 
 
 
* Theoretical Aspects of Lexical Analysis
 
* The GNU flex Lexical Analyser
 
* Lexical Analysis Case
 
 
 
== Syntactic Analysis ==
 
 
 
* Theoretical Aspects of Syntax
 
* Using Berkeley YACC
 
* Syntactic Analysis Case
 
 
 
== Semantic Analysis ==
 
 
 
* The Syntax-Semantics Interface
 
* Semantic Analysis and Code Generation
 
 
 
== See Also ==
 
 
 
* The CDK Library
 
* Postfix Code Generator
 
* The Runtime Library
 
 
 
== Further Reading ==
 
 
 
* Aho, A. V., Sethi, R., Ullman, J. D. 1986. Compilers: Principles, Techniques, and Tools. Addison-Wesley Publishing Company. ISBN 0-201-10194-7.
 
* Gamma, E., Helm, R., Johnson, R., Vlissides, J. 1995. Design Patterns: Elements of Reusable Object-Oriented Software. Addison-Wesley. ISBN 0-201-63361-2.
 
* ISO 2001. Information processing – Text and office systems – Standard Generalized Markup Language (SGML). ISO – International Organization for Standardization. ISO 8879:1986 (standard). Technical committee/subcommittee: JTC 1/SC 34; ISO Standards.
 
* Nasm, The Netwide Assembler. http://freshmeat.net/projects/nasm/
 
* OMG. 2002, January. XML Metadata Interchange (XMI) Specification, v1.2. http://www.omg.org/technology/documents/formal/xmi.htm
 
* Santos, P. R. dos. 2004. postfix.h
 
* W3C. 1999. XSL Transformations (XSLT), Version 1.0. http://www.w3.org/TR/xslt
 
* W3C. 2001. Extensible Markup Language. http://www.w3.org/XML/
 
* W3C. 2001. XML Schema. http://www.w3c.org/XML/Schema
 
* W3C. 2002. Document Object Model. http://www.w3.org/DOM/
 

Latest revision as of 22:57, 5 June 2023


Our compiler development approach adopts the classical structure, beginning with lexical analysis and moving through syntactic analysis, semantic analysis, and finally code generation. This process utilizes two code generators -- one for creating the scanner and another for developing the syntactic parser -- and leverages modules manually crafted by the programmer.

While we adhere to the traditional versions of the scanner and parser generators (Flex and BYACC), we transition from C to C++. Our usage of C++ extends beyond merely swapping languages and scripting several classes. Instead, we exploit the full potential of object-oriented programming (OOP) and design patterns, leading to a compiler that is organized in a logical, user-friendly manner, and is amenable to modifications. These features are paramount in terms of programming effort and pedagogical objectives.

A Case for C++

Using C++ is not only a way of getting a "better C": it also allows OOP architecture principles to be applied natively. These principles could be implemented in C, but at the cost of increased development complexity. Thus, our interest is not confined to recompiling old C code with a C++ compiler and hoping for the best. Instead, C++ is meant to affect every facet of compiler development, from the overall organization of the compiler to the composition of each component.

The decision to use C++ is not solely about selecting a programming language; it is also about who, or what, writes the compiler code. For a human programmer, using C++ is merely a matter of competence, but tools that generate part of the compiler's code must be selected carefully to ensure that the code they produce works as expected.

Several widely-used compiler development support tools, such as the Flex lexical analyzer and the Bison parser generator, natively support C++. Conversely, some tools, like Berkeley YACC (BYACC), only support C. In the former scenario, the generated code and the objects it aids merely need integration into the architecture. In the latter scenario, additional adaptations might be necessary, performed either by the programmer or via specialized wrappers. Despite being C code, BYACC-generated parsers, as we will see, are relatively straightforward to adapt to C++.
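
As a rough illustration of what such an adaptation might look like, the sketch below wraps a yacc-style C parser in a C++ class. It is only a sketch under stated assumptions: the `Compiler` class and its members are hypothetical names, and the `yyparse` function here is a hand-written stand-in for the code BYACC would generate, accepting a toy token stream. The only convention relied upon is the usual yacc one, in which the parser obtains tokens by calling `yylex()` and reports problems by calling `yyerror()`.

```cpp
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

// Hooks a yacc-style parser expects. C linkage matters only if the generated
// parser is compiled as C; it is harmless otherwise.
extern "C" int yylex(void);
extern "C" void yyerror(const char *msg);

// Stand-in for the BYACC-generated parser (assumption for this demo):
// it accepts any run of token code 1 terminated by end of input (code 0).
extern "C" int yyparse(void) {
  for (int tok = yylex(); tok != 0; tok = yylex()) {
    if (tok != 1) {
      yyerror("unexpected token");
      return 1;
    }
  }
  return 0;
}

// Hypothetical C++ wrapper: owns the token source and the error log, and
// exposes the C parser through an ordinary member function.
class Compiler {
  std::vector<int> _tokens;
  std::size_t _next = 0;
  std::vector<std::string> _errors;
  static Compiler *_active;  // object currently driving the C parser

public:
  explicit Compiler(std::vector<int> tokens) : _tokens(std::move(tokens)) {}

  // Returns the next token code, or 0 at end of input.
  int next_token() { return _next < _tokens.size() ? _tokens[_next++] : 0; }

  void report(const char *msg) { _errors.emplace_back(msg); }
  const std::vector<std::string> &errors() const { return _errors; }

  // Runs the C parser with this object as the source of tokens.
  bool parse() {
    _active = this;
    bool ok = (yyparse() == 0);
    _active = nullptr;
    return ok;
  }

  static Compiler *active() { return _active; }
};

Compiler *Compiler::_active = nullptr;

// The C hooks simply forward to the active C++ object.
extern "C" int yylex(void) { return Compiler::active()->next_token(); }
extern "C" void yyerror(const char *msg) { Compiler::active()->report(msg); }
```

With this arrangement, client code would construct a `Compiler` and call `parse()` without touching the C-level functions. The static pointer keeps the example short but allows only one parse at a time; a real adaptation might pass the object explicitly instead.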

The Role of Design Patterns

Transitioning from fundamental OOP principles to the realm of design patterns may seem like a small leap, but it brings substantial advantages. Effective use of select design patterns -- particularly, but not limited to, the Composite and Visitor design patterns -- results in a more robust compiler and a streamlined development process.
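
To make the combination of the two patterns more concrete, here is a minimal sketch, assuming a tiny expression language with integers, addition, and multiplication. All class names (`Node`, `IntegerNode`, `AddNode`, `MulNode`, `Evaluator`, `PostfixWriter`) and the helper `sample_tree` are hypothetical and chosen for illustration only. The node hierarchy plays the Composite role, and each operation over the tree is a separate Visitor.

```cpp
#include <memory>
#include <string>
#include <utility>

class IntegerNode;
class AddNode;
class MulNode;

// Visitor: one visit method per concrete node type.
class Visitor {
public:
  virtual ~Visitor() = default;
  virtual void visit(const IntegerNode &node) = 0;
  virtual void visit(const AddNode &node) = 0;
  virtual void visit(const MulNode &node) = 0;
};

// Composite root: leaves and inner nodes share this interface.
class Node {
public:
  virtual ~Node() = default;
  virtual void accept(Visitor &v) const = 0;
};

class IntegerNode : public Node {
  int _value;
public:
  explicit IntegerNode(int value) : _value(value) {}
  int value() const { return _value; }
  void accept(Visitor &v) const override { v.visit(*this); }
};

// Inner node holding two children (still abstract: no accept here).
class BinaryNode : public Node {
  std::unique_ptr<Node> _left, _right;
public:
  BinaryNode(std::unique_ptr<Node> l, std::unique_ptr<Node> r)
      : _left(std::move(l)), _right(std::move(r)) {}
  const Node &left() const { return *_left; }
  const Node &right() const { return *_right; }
};

class AddNode : public BinaryNode {
public:
  using BinaryNode::BinaryNode;
  void accept(Visitor &v) const override { v.visit(*this); }
};

class MulNode : public BinaryNode {
public:
  using BinaryNode::BinaryNode;
  void accept(Visitor &v) const override { v.visit(*this); }
};

// One operation over the tree: evaluate the expression.
class Evaluator : public Visitor {
  int _result = 0;
  int eval(const Node &n) { n.accept(*this); return _result; }
public:
  int result() const { return _result; }
  void visit(const IntegerNode &n) override { _result = n.value(); }
  void visit(const AddNode &n) override {
    int l = eval(n.left());
    _result = l + eval(n.right());
  }
  void visit(const MulNode &n) override {
    int l = eval(n.left());
    _result = l * eval(n.right());
  }
};

// Another operation, added without touching the node classes:
// emit the expression in postfix notation.
class PostfixWriter : public Visitor {
  std::string _out;
public:
  const std::string &text() const { return _out; }
  void visit(const IntegerNode &n) override { _out += std::to_string(n.value()) + " "; }
  void visit(const AddNode &n) override {
    n.left().accept(*this);
    n.right().accept(*this);
    _out += "+ ";
  }
  void visit(const MulNode &n) override {
    n.left().accept(*this);
    n.right().accept(*this);
    _out += "* ";
  }
};

// Builds the tree for 2 + 3 * 4.
std::unique_ptr<Node> sample_tree() {
  return std::make_unique<AddNode>(
      std::make_unique<IntegerNode>(2),
      std::make_unique<MulNode>(std::make_unique<IntegerNode>(3),
                                std::make_unique<IntegerNode>(4)));
}
```

The point of the split is that a new pass over the tree (type checking, code generation, pretty-printing) becomes a new `Visitor` subclass, while the node classes stay unchanged. The trade-off is that adding a new node type requires a new `visit` method in every visitor.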