Optimization Topics

From Wiki³

Revision as of 00:26, 9 June 2023 by Root (talk | contribs) (Peephole - window over a series of consecutive instructions)


Basic Blocks

Maximal instruction sequence in which control flow enters only at the first instruction and leaves only at the last: there are no jumps into the middle of the block, and a jump may occur only as its last instruction.

Transformations within a basic block must preserve the block's semantics (the block is treated as an atomic entity).

Optimization Levels

Compiler optimization is an essential process in compiler design that enhances program performance by minimizing execution time, memory consumption, and other metrics. Various optimization techniques can be employed at different stages of the compilation process, such as user, generic, peephole, local, inter-block, and global optimization. Each optimization level has a different focus, ranging from specific user algorithms to the entire code structure, aiming to streamline the execution process and make the compiled code as efficient as possible.

User: algorithm- and application-level optimizations

User-level, that is, algorithm- and application-level, optimizations are specific to a particular algorithm or application. They are usually done by the developer who writes the program and knows the algorithm's specificities and the application's behavior. These optimizations aim to rewrite or refactor the code to improve its performance, often based on knowledge of the algorithm or application. For instance, a developer could optimize a sorting algorithm by choosing the most efficient one based on the data's expected characteristics.

Generic: considers intermediate code (processor-independent)

Generic optimizations are carried out considering the intermediate code and are processor-independent. These optimizations are typically implemented in the middle-end of the compiler, after the frontend has parsed the code into an intermediate representation but before the backend generates machine code. Generic optimizations can include things like constant folding, dead code elimination, or loop optimization. Because they're not tied to a specific processor or machine architecture, these optimizations can be applied to any program being compiled.

Peephole: window over a series of consecutive instructions

Peephole optimization involves looking at a small window of consecutive instructions in the assembly or machine code and attempting to improve them. The compiler replaces these instructions with a more efficient sequence, reducing redundancy, and improving execution speed. This process could include eliminating unnecessary load and store operations, simplifying arithmetic operations, or replacing slow instructions with faster ones.

Local: optimization within a basic block

Local optimization is focused on enhancing the performance within a basic block (a sequence of code with one entry point and one exit point, with no possibility of branching in between). These blocks are identified after the program has been parsed into an intermediate representation. The optimization here focuses on improving the efficiency of each block by eliminating redundant computations, dead code, and performing register allocation, among others.

Inter-block: information flow between blocks, mainly cycles

Inter-block optimization takes into consideration the information flow between blocks, mainly cycles. In this phase, the compiler looks for optimization opportunities that involve more than one basic block, such as loop unrolling or loop invariant code motion. These optimizations can help to minimize redundant computations and enhance overall code efficiency.

Global: multiple jumps (jumps to jumps, jumps to the next instruction)

Global optimizations focus on the entire function or program, optimizing across multiple jumps, including jumps to jumps and jumps to the next instruction. These optimizations are more complex than local optimizations, as they require a broader view of the program's control flow. Global optimizations may include function inlining, loop unrolling across multiple basic blocks, or interprocedural optimizations that span multiple functions or procedures. These kinds of optimizations can significantly improve execution speed but can also increase the size of the compiled code.

Machine-independent optimizations

Constant folding

Direct evaluation of expressions using only literal and known values:

 int *x = (int *)malloc(1024 * 4);
 int  y = 3;
 int  z = 20 * y + 10;

Becomes:

 int *x = (int *)malloc(4096);
 int  y = 3;
 int  z = 70;

Elimination of Common Subexpressions

Using temporary variables to store common results:

  d = c * (a + b)
  e = (a + b) / 2

Becomes:

  t = a + b
  d = c * t
  e = t / 2

Cycles / Loops

Loop unrolling

  for (i = 1; i < n; i++)
    a[i] = b[i] + 10;

Becomes:

  for (i = 1; i + 3 < n; i+=4) {
    a[i]   = b[i]   + 10;
    a[i+1] = b[i+1] + 10;
    a[i+2] = b[i+2] + 10;
    a[i+3] = b[i+3] + 10;
  }
 
  // rest of the cycle
  for (; i < n; i++)
    a[i] = b[i] + 10;

Moving invariant code

  for (i = 1; i < n; i++) {
    a[i] = b[i] + c * d;
    e = g[k];
  }

Becomes:

  t = c * d;
  for (i = 1; i < n; i++) {
    a[i] = b[i] + t;
  }
  e = g[k];

Induction variables

  for (i = 1; i < n; i++)
    k = i * 4 + m;

Becomes:

  k = m;
  for (i = 1; i < n; i++)
    k = k + 4;

Other

Machine-dependent Optimizations

Algebraic Simplification

 xor eax,eax

instead of

 mov eax,0

Strength Reduction

 (x << 2) + x

instead of

 x * 5

Instruction Reordering

Reorder independent instructions to avoid pipeline stalls and reduce register pressure (and thus register spilling).

Other

Exercises