Semantic analysis is mostly concerned with types associated with language objects and how these types are used by the language constructs that depend on them, such as functions and arithmetic operators.
Types can be implicitly specified (e.g., in literals) and inferred (e.g., from operations). This is the case in Python and other scripting languages, which determine types at run time. It is also possible in languages such as C++ (auto) and Java (var), which perform type inference at compile time.
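As an illustration of compile-time inference, the following minimal C++ sketch lets the compiler deduce the types of three variables from their initializers. The function name `inferred_sum` is hypothetical and exists only for this example; the `static_assert` lines fail the build if the deduced types differ from the ones stated in the comments.

```cpp
#include <cassert>
#include <type_traits>

// Hypothetical helper: every type below is deduced by the compiler.
inline double inferred_sum() {
  auto i = 42;       // deduced as int, from the integer literal
  auto d = 1.5;      // deduced as double, from the floating-point literal
  auto sum = i + d;  // deduced as double: i is converted before the addition

  // Compile-time checks of the deduced types.
  static_assert(std::is_same<decltype(i), int>::value, "i is an int");
  static_assert(std::is_same<decltype(d), double>::value, "d is a double");
  static_assert(std::is_same<decltype(sum), double>::value, "int + double yields double");

  assert(sum > i);  // run-time sanity check only; the types were fixed earlier
  return sum;
}
```

Nothing about the types is decided when the program runs: a mismatch would be reported by the compiler, which is the behavior the text contrasts with run-time typing.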
On the other hand, typed entities may be explicitly declared. This is how most statically compiled languages work: the program's entities are explicitly typed and types may be verified by the compiler.
This section focuses on type checking based on the nodes of the abstract syntax tree (AST), specifically those that declare typed entities (such as functions and variables) and those that use them (functions and operators). The entities themselves must, of course, record their own types, so that each use can be checked against them.
Type information is stored in the AST itself. It may be set directly by the parser during syntactic analysis (e.g., in declarations) or, more commonly, during semantic analysis.
The main nodes involved in representing types are the following:
typed_node.h:

    #ifndef __CDK15_AST_TYPEDNODE_NODE_H__
    #define __CDK15_AST_TYPEDNODE_NODE_H__

    #include <cdk/ast/basic_node.h>
    #include <cdk/types/types.h>
    #include <memory>

    namespace cdk {

      /**
       * Typed nodes store a type description.
       */
      class typed_node: public basic_node {
      protected:
        // This must be a pointer, so that we can anchor a dynamic
        // object and be able to change/delete it afterwards.
        std::shared_ptr<basic_type> _type;

      public:
        /**
         * @param lineno the source code line number corresponding to
         * the node
         */
        typed_node(int lineno) :
            basic_node(lineno), _type(nullptr) {
        }

        std::shared_ptr<basic_type> type() {
          return _type;
        }
        void type(std::shared_ptr<basic_type> type) {
          _type = type;
        }

        bool is_typed(typename_type name) const {
          return _type->name() == name;
        }

      };

    } // cdk

    #endif
expression_node.h:

    #ifndef __CDK15_AST_EXPRESSIONNODE_NODE_H__
    #define __CDK15_AST_EXPRESSIONNODE_NODE_H__

    #include <cdk/ast/typed_node.h>

    namespace cdk {

      /**
       * Expressions are typed nodes that have a value.
       */
      class expression_node: public typed_node {
      protected:
        /**
         * @param lineno the source code line corresponding to the node
         */
        expression_node(int lineno) :
            typed_node(lineno) {
        }

      };

    } // cdk

    #endif
lvalue_node.h:

    #ifndef __CDK15_LVALUE_NODE_H__
    #define __CDK15_LVALUE_NODE_H__

    #include <cdk/ast/typed_node.h>
    #include <string>

    namespace cdk {

      /**
       * Class for describing syntactic tree leaves for lvalues.
       */
      class lvalue_node: public typed_node {
      protected:
        lvalue_node(int lineno) :
            typed_node(lineno) {
        }

      };

    } // cdk

    #endif
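The life cycle of a node's type (initially unset, then filled in by the parser or by semantic analysis) can be sketched with self-contained stand-ins. Everything in the `sketch` namespace below is an assumption made for illustration: `basic_type`, `typename_type` and `typed_node` only imitate the shape of the CDK classes listed above, and `annotate_as_int` and `node_size` are hypothetical helpers. Unlike the listing above, this `is_typed` returns false for a node that has no type yet instead of dereferencing a null pointer.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>

namespace sketch {

  // Hypothetical type names; not the CDK enumeration.
  enum typename_type { TYPE_INT, TYPE_DOUBLE };

  // Stand-in for a type description: a name and a size in bytes.
  class basic_type {
    std::size_t _size;
    typename_type _name;
  public:
    basic_type(std::size_t size, typename_type name) : _size(size), _name(name) {}
    std::size_t size() const { return _size; }
    typename_type name() const { return _name; }
  };

  // Stand-in for a typed AST node.
  class typed_node {
    int _lineno;
    std::shared_ptr<basic_type> _type;  // null until a type is assigned
  public:
    explicit typed_node(int lineno) : _lineno(lineno), _type(nullptr) {}
    int lineno() const { return _lineno; }
    std::shared_ptr<basic_type> type() const { return _type; }
    void type(std::shared_ptr<basic_type> t) { _type = t; }
    // Assumption of this sketch: an untyped node matches no type name.
    bool is_typed(typename_type name) const { return _type && _type->name() == name; }
  };

  // What a type-checking pass might do when it visits an integer literal.
  inline void annotate_as_int(typed_node &node) {
    if (node.type() == nullptr)  // keep a type already set by the parser
      node.type(std::make_shared<basic_type>(4, TYPE_INT));
  }

  // Later passes may rely on the annotation, e.g. to reserve memory.
  inline std::size_t node_size(const typed_node &node) {
    assert(node.type() != nullptr);  // precondition: the node was type-checked
    return node.type()->size();
  }

} // sketch
```

Keeping the type behind a shared pointer lets several nodes refer to the same type description and lets a later pass replace it without touching the rest of the node.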
Any program in any language must manipulate objects to perform its various tasks. These objects are characterized by their structure, which is mapped to the programmer level as data types.
Data types may be explicit, as in the C/C++ family of languages, or they may be inferred from the primitive objects and the operations they are subject to (e.g. CAML).
The public interface of the symbol table is the following (all non-public parts have been omitted, as well as the construction/destruction methods):
    #include <cstddef>
    #include <memory>
    #include <string>

    namespace cdk {

      template<typename Symbol>
      class symbol_table {
      public:
        // Context (scope) management.
        void push();
        void pop();

        // Definition and replacement of symbols.
        bool insert(const std::string &name, std::shared_ptr<Symbol> symbol);
        bool replace_local(const std::string &name, std::shared_ptr<Symbol> symbol);
        bool replace(const std::string &name, std::shared_ptr<Symbol> symbol);

        // Lookup: current context only, or across contexts.
        std::shared_ptr<Symbol> find_local(const std::string &name);
        std::shared_ptr<Symbol> find(const std::string &name, size_t from = 0) const;
      };

    } // cdk
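To make the role of these operations concrete, here is a minimal sketch of a scoped table offering a subset of the interface (`push`, `pop`, `insert`, `find_local`, `find`). It is not the CDK implementation. The assumed semantics are that `insert` fails when the name already exists in the current context, and that `find` searches from the innermost context outwards. The `var_symbol` type and the `shadowing_example` function are hypothetical.

```cpp
#include <cassert>
#include <map>
#include <memory>
#include <string>
#include <vector>

namespace sketch {

  // Hypothetical symbol: records only the name of the variable's type.
  struct var_symbol {
    std::string type_name;
  };

  template <typename Symbol>
  class symbol_table {
    typedef std::map<std::string, std::shared_ptr<Symbol>> context;
    std::vector<context> _contexts;  // back() is the innermost context

  public:
    symbol_table() : _contexts(1) {}  // start with one (global) context

    void push() { _contexts.push_back(context()); }
    void pop() { if (_contexts.size() > 1) _contexts.pop_back(); }

    // Assumed semantics: fails only if the name exists in the current context.
    bool insert(const std::string &name, std::shared_ptr<Symbol> symbol) {
      context &current = _contexts.back();
      if (current.find(name) != current.end()) return false;
      current[name] = symbol;
      return true;
    }

    std::shared_ptr<Symbol> find_local(const std::string &name) const {
      const context &current = _contexts.back();
      auto it = current.find(name);
      if (it == current.end()) return nullptr;
      return it->second;
    }

    // Searches from the innermost context to the outermost one.
    std::shared_ptr<Symbol> find(const std::string &name) const {
      for (auto ctx = _contexts.rbegin(); ctx != _contexts.rend(); ++ctx) {
        auto it = ctx->find(name);
        if (it != ctx->end()) return it->second;
      }
      return nullptr;
    }
  };

  // An inner declaration of x shadows the outer one until its block ends.
  inline std::string shadowing_example() {
    symbol_table<var_symbol> symtab;
    bool ok = symtab.insert("x", std::make_shared<var_symbol>(var_symbol{"int"}));
    assert(ok);
    symtab.push();  // entering a block
    ok = symtab.insert("x", std::make_shared<var_symbol>(var_symbol{"double"}));
    assert(ok);
    std::string inner = symtab.find("x")->type_name;
    symtab.pop();   // leaving the block
    std::string outer = symtab.find("x")->type_name;
    return inner + "," + outer;
  }

} // sketch
```

In a semantic analyzer built along these lines, a null result from `find` would correspond to the use of an undeclared identifier, and a false result from `insert` to a redeclaration in the same scope.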
Type checking is the process of verifying whether the types used in the various language constructs are appropriate. It can be performed at compile time (static type checking) or at run time.
The type checking discussed here is static, i.e., it verifies at compile time that the types of objects are consistent with the operations that manipulate them.
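A static check of this kind can be sketched as a bottom-up walk over an expression tree that computes the type of each node and rejects inconsistent operands before any code runs. The tree representation (`expr`, `literal`, `add`) and the typing rule are assumptions made for this example: `+` accepts only numeric operands and yields a double if either operand is a double. A real language defines its own rules.

```cpp
#include <cassert>
#include <memory>
#include <stdexcept>

namespace sketch {

  enum class type { INT, DOUBLE, STRING };

  // Hypothetical expression node: either a literal leaf or a binary '+'.
  struct expr {
    bool is_leaf;
    type leaf_type;                     // meaningful for leaves only
    std::shared_ptr<expr> left, right;  // meaningful for '+' only
  };

  inline std::shared_ptr<expr> literal(type t) {
    return std::make_shared<expr>(expr{true, t, nullptr, nullptr});
  }

  inline std::shared_ptr<expr> add(std::shared_ptr<expr> l, std::shared_ptr<expr> r) {
    return std::make_shared<expr>(expr{false, type::INT, l, r});
  }

  // Returns the static type of e; throws if the operand types are inconsistent.
  inline type check(const std::shared_ptr<expr> &e) {
    assert(e != nullptr);
    if (e->is_leaf) return e->leaf_type;
    type l = check(e->left);
    type r = check(e->right);
    if (l == type::STRING || r == type::STRING)
      throw std::invalid_argument("'+' requires numeric operands");
    return (l == type::DOUBLE || r == type::DOUBLE) ? type::DOUBLE : type::INT;
  }

} // sketch
```

Under these rules, `check(add(literal(type::INT), literal(type::DOUBLE)))` evaluates to `type::DOUBLE`, while an addition involving a string literal is rejected with an exception, which plays the role of a compile-time error message.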
In the approach followed by CDK-based compilers, code generation is carried out by visitors that traverse the abstract syntax tree and generate code by evaluating each node. Node evaluation may depend on specific properties of the data types being manipulated, the simplest of which is the data type's size, which matters in all memory-related operations.
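The dependence on type size can be illustrated with a small sketch that computes how much stack space a function's local variables need. The sizes are assumptions for a hypothetical 32-bit target, and `var_type`, `size_of` and `frame_size` are names invented for this example; alignment is ignored.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

enum class var_type { INT, DOUBLE, POINTER };

// Sizes in bytes for a hypothetical 32-bit target (an assumption of this sketch).
inline std::size_t size_of(var_type t) {
  switch (t) {
    case var_type::INT:     return 4;
    case var_type::DOUBLE:  return 8;
    case var_type::POINTER: return 4;
  }
  assert(false && "unknown type");
  return 0;
}

// Total space to reserve for the given local variables, ignoring alignment.
inline std::size_t frame_size(const std::vector<var_type> &locals) {
  std::size_t total = 0;
  for (var_type t : locals) total += size_of(t);
  return total;
}
```

A code generator would use a value like this when emitting the instruction that reserves the stack frame, which is why every declared entity must carry its type by the time generation starts.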
The following example considers a simple grammar and performs the whole of the semantic analysis process and, finally, generates the corresponding C code. The semantic analysis process must account for variables (they must be declared before they can be used) and for their types (all types must be used correctly).
The following example considers an evolution of Compact, called Simple. Where Compact enforces some checks through syntactic analysis (and is thus less flexible), Simple has a richer grammar and, consequently, admits constructions that may be incorrect with respect to the types of operators, functions, and their arguments. Type checking is therefore an integral part of this compiler: without it, it would be impossible to guarantee the correctness of any expression.