The Postfix reference guide contains information about the structure and operations of the stack machine.
The original stack machine was created by Santos (2004). Is was composed by a set of macros to be used with printf functions. Each macro would “take” as arguments, either a number or a string. This was a simple and effective approach but was limited in its expressiveness.
The current postfix code generator class maintains the stack machine abstraction, but does not rely on macros. Instead, it defines an interface to be used by semantic analysers, as defined by a strategy pattern (Gamma et al., 1995). Specific implementations provide the realization of the postfix commands for a particular target machine. Since it is written in C++, it's very easy to extend to new needs and implementations (new target machines).
Like the original postfix code generator, the current abstraction uses an architecture based on a stack machine, hence the name ``postfix, and three registers.
In some of the following tables, the "Stack" column presents the actions on the values at the top of the stack. Note that only elements relevant in a given context, i.e., that of the postfix instruction being executed, are shown. The notation #length represents a set of length consecutive bytes in the stack, i.e., a vector.
Consider the following example:
$ a #8 b : $ a b
The stack had at its top b, followed by eight bytes, followed by a. After executing some postfix instruction using these elements, the stack has at its top b, followed by a. We use $ to denote the point in the stack not affected by the current operation (this could be the top if the stack were empty).
The following groups of operations are available in the Postfix interface:
These operations start various segments. They do not affect the stack.
BSS | Specifies/selects the data segment for uninitialized values | ||
DATA | Specifies/selects the data segment for initialized values | ||
RODATA | Specifies/selects the data segment for initialized constant values | ||
TEXT | Specifies/selects the text (code) segment |
These operations declare values directly in various segments. They do not affect the stack.
BYTE size | Declares an uninitialized vector with length size (in bytes) | ||
SHORT value | Declares a static 16-bit integer value | ||
CHAR value | Declares a static 8-bit character value | ||
CONST value | Declares a static 32-bit integer value | ||
DOUBLE value | Declares a static double precision (64-bit) floating point value | ||
FLOAT value | Declares a static simple precision (32-bit) floating point value | ||
ID name | Declares a name for an address (i.e., declares the address associated with name) | ||
STR string | Declares a static NULL-terminated character string (C-like) (may contain special characters) |
These operations handle symbols and their definitions within some segment. They do not affect the stack.
ALIGN | Forces the alignment of code or data | ||
LABEL name | Generates a new label name | ||
EXTERN name | Declares name as a symbol externally defined, i.e., defined in another compilation module | ||
GLOBAL name, type | Declares a name with a given type (see below) -- the declaration of a name must preceed its definition |
In a declaration common to several modules, any number of modules may contain common or external declarations, but only one of them may contain an initialized declaration. A declaration does not need to be specified in a specific segment.
Global names may be of different types. These labels are to be used to generate the types needed for the second argument of GLOBAL.
NONE | Unknown type | ||
FUNC | Name/label corresponds to a function | ||
OBJ | Name/label corresponds to an object (data) |
Absolute addressing uses addresses based on named labels. Local addressing is used in function frames and uses offsets relative to the frame pointer to load data: negative addresses correspond to local variables, offset zero contains the previous (saved) value of the frame pointer, offset 4 (32 bits) contains the previous (saved) value of the instruction pointer, and, after offset 8, reside the function arguments.
ADDR name | $ | $ name | Absolute addressing: load address of name |
ADDRA name | $ a | $ | Absolute addressing: store a to name |
ADDRV name | $ | $ [name] | Absolute addressing: load value at name |
LOCAL offset | $ | $ fp+offset | Local addressing: load address of offset |
LOCA offset | $ a | $ | Local addressing: writes a to offset |
LOCV offset | $ | $ [fp+offset] | Local addressing: load value at offset |
ADDRA, ADDRV, LOCA, LOCV are functionally equivalent to ADDR+STORE, ADDR+LOAD, LOCAL+STORE, LOCAL+LOAD, but the generated code is more efficient. They are compound operations (i.e., they contain not only the addressing part, but also the load/store part as well).
The load instructions assume that the top of the stack contains an address pointing to the data to be read. Each load instruction will replace the address at the top of the stack with the contents of the position it points to. Load operations differ only in what they load.
LOAD | $ addr | $ [addr] | Loads 8 bytes (int/float) |
DLOAD | $ addr | $ [addr] | Loads 8 bytes (double) |
LDCHR | $ addr | $ [addr] | Loads 1 byte (char) |
ULDCHR | $ addr | $ [addr] | Loads 1 byte (without sign) (unsigned char) |
LD16 | $ addr | $ [addr] | Loads 2 bytes (short) |
ULD16 | $ addr | $ [addr] | Loads 2 bytes (without sign) (unsigned short) |
Store instructions assume the stack contains at the top the address where data is to be stored. That data is in the stack, immediately after the address. Store instructions differ only in what they store.
STORE | $ val addr | $ | Stores 4 bytes (int/float) |
DSTORE | $ val addr | $ | Stores 8 bytes (double) |
STCHR | $ val addr | $ | Stores 1 byte (char) |
ST16 | $ val addr | $ | Stores 2 bytes (short) |
ALLOC | ||
DUP | ||
DDUP | ||
SWAP | ||
PUSH | ||
DPUSH | ||
POP | ||
DPOP | ||
INT | ||
SP | $ | $ sp |
The arithmetic operations considered here apply to both signed and unsigned integer arguments, and to double precision floating point arguments.
NEG | $ a | $ -a | Negation (symmetric) of integer value |
ADD | $ a b | $ a+b | Integer sum of two integer values |
SUB | $ a b | $ a-b | Integer subtraction of two integer values |
MUL | $ a b | $ a*b | Integer multiplication of two integer values |
DIV | $ a b | $ a/b | Integer division of two integer values |
MOD | $ a b | $ a%b | Remainder of the integer division of two integer values |
UDIV | $ a b | $ a/b | Integer division of two natural (unsigned) integer values |
UMOD | $ a b | $ a%b | Remainder of the integer division of two natural (unsigned) integer values.\\\hline |
These operations take double precision floating
DNEG | $ a | $ -a | Negation (symmetric) |
DADD | $ a b | $ a+b | Sum |
DSUB | $ a b | $ a-b | Subtraction |
DMUL | $ a b | $ a*b | Multiplication |
DDIV | $ a b | $ a/b | Division |
The comparison instructions are binary operations that leave at the top of the stack 0 (zero) or 1 (one), depending on the result result of the comparison: respectively, false or true. The value may be directly used to perform conditional jumps (e.g., JZ, JNZ), that use the value of the top of the stack instead of relying on special processor registers ("flags").
EQ | $ a b | $ a≡b | equal to |
NE | $ a b | $ a≠b | not equal to |
GT | $ a b | $ a>b | greater than |
GE | $ a b | $ a≥b | greater than or equal to |
LE | $ a b | $ a≤b | less than or equal to |
LT | $ a b | $ a<b | less than |
The following consider unsigned operands:
UGT | $ a b | $ a>b | greater than for unsigned integers |
UGE | $ a b | $ a≥b | greater than or equal to for unsigned integers |
ULE | $ a b | $ a≤b | less than or equal to for unsigned integers |
ULT | $ a b | $ a<b | less than for unsigned integers |
This operator compares two double precision floating point numbers. The result is an integer value: less than 0, if the first operand is less than the second; 0, if they are equal; greater than 0, otherwise.
DCMP | $ a b | i | "compare" -- i<0, a<b; i≡0, a≡b; i>0, a>b |
NOT | $ a | $ ~a | Logical (bitwise) negation, i.e., one's complement |
AND | $ a b | $ a∧b | Logical (bitwise) AND operation |
OR | $ a b | $ a∨b | Logical (bitwise) OR operation |
XOR | $ a b | $ a⊕b | Logical (bitwise) XOR (exclusive OR) operation |
rotl rotr shtl shtru shtrs and or not xor
In a stack machine the arguments for a function call are already in the stack. Thus, it is not necessary to put them there (it is enough not to remove them). When building functions that conform to the C calling convetions~\cite{cdecl}, those arguments are destroyed by the caller, \textsl{after} the return of the callee, using \texttt{\small TRASH}, stating the total size (i.e., for all arguments). Similarly, to return values from a function, the callee must call \texttt{\small POP} to store the return value in the accumulator register, so that it survives the destruction of the invocation context. The caller must call \texttt{\small PUSH}, to put the accumulator in the stack. An analogous procedure is valid for DPOP/DPUSH (for double precision floating point return values).
Each function must allocate space for its local variables. This is done immediately after being called and before any other processing. The relevant operations are ENTER (to specify a given memory amount) and START (no space is reserved for local variables. Note that these operations do more than manipulate the stack: they also create an activation register for the function, i.e., they update the frame pointer and define a new stack frame.
ENTER bytes | $ | $ fp #bytes | Starts a function: push the frame pointer (activation register) to the stack and allocate space for local variables (bytes) |
START | $ | $ fp | Equivalent to "ENTER 0" |
POP or DPOP may be called to specify return values in accordance with C conventions. Only return values that fit in these registers need these operations. Other return values are passed by pointer.
Note that these operations make use of specific hardware registers (POP->eax, DPOP->st0).
POP | $ a | $ | Removes a 32-bit integer value from the stack (to eax) |
DPOP | $ d | $ | Removes a double precision (64-bit) floating point value from the stack (to st0) |
The stack frame is destroyed by the LEAVE operation. This action must be performed immediately before returning control to the caller (with RET).
LEAVE | $ fp ... | $ | Ends a function: restores the frame pointer (activation register) and destroys the function-local stack data |
After the function's stack frame is destroyed and the activation register is restored to the caller, control must also be returned to the caller (i.e., IP must be updated).
RET | $ addr | $ | Returns from a function (the stack must contain the return address) |
RETN bytes | $ #bytes addr | $ | Returns from a function and removes bytes from the caller's stack after removing the return address. This is more or less the same as "RET+TRASH bytes". Note that this is not compatible with the Cdecl calling conventions. |
In a stack machine the arguments for a function call are already in the stack. Thus, it is not necessary to put them there (it is enough not to remove them). When building functions that conform to the C calling convetions~\cite{cdecl}, those arguments are destroyed by the caller, \textsl{after} the return of the callee, using \texttt{\small TRASH}, stating the total size (i.e., for all arguments).
To recover the returned value, the caller must call \texttt{\small PUSH}, to put the accumulator in the stack. An analogous procedure is valid for DPOP/DPUSH (for double precision floating point return values).
\verb|CALL(std::string name)| & \texttt{\small } & \texttt{\small addr} & Calls the named function. Stores the return address in the stack.\\\hline
\verb|TRASH(int n)| & \texttt{\small \#n} & \texttt{\small } & Removes \texttt{\small n} bytes from the stack.\\\hline
\verb|PUSH()| & \texttt{\small } & \texttt{\small a} & Pushes the value in the accumulator register to the stack.\\\hline \verb|DPUSH()| & \texttt{\small } & \texttt{\small d} & Pushes the value in the double precision floating point register to the stack.\\\hline
\verb|JMP(std::string)| & \texttt{\small } & \texttt{\small } & Unconditional jump to the label given as argument.\\\hline\hline \verb|LEAP()| & \texttt{\small addr} & \texttt{\small } & Unconditional jump to the address indicated by the value at the top of the stack.\\\hline \verb|BRANCH()| & \texttt{\small addr} & \texttt{\small ret} & Invokes a function at the address indicated by the value at the top of the stack. The return value is pushed to the stack.\\\hline
\verb|JZ(std::string)| & \texttt{\small a} & \texttt{\small } & Jump to the address of the label passed as argument if the value at the top of the stack is 0 (zero).\\\hline \verb|JNZ(std::string)| & \texttt{\small a} & \texttt{\small } & Jump to the address of the label passed as argument if the value at the top of the stack is non-zero.\\\hline
\verb|JGT(std::string)| & \texttt{\small } & \texttt{\small } & \\\hline \verb|JGE(std::string)| & \texttt{\small } & \texttt{\small } & \\\hline \verb|JEQ(std::string)| & \texttt{\small } & \texttt{\small } & \\\hline \verb|JLE(std::string)| & \texttt{\small } & \texttt{\small } & \\\hline \verb|JLT(std::string)| & \texttt{\small } & \texttt{\small } & \\\hline \verb|JNE(std::string)| & \texttt{\small } & \texttt{\small } & \\\hline\hline \verb|JUGT(std::string)| & \texttt{\small } & \texttt{\small } & \\\hline \verb|JUGE(std::string)| & \texttt{\small } & \texttt{\small } & \\\hline \verb|JULE(std::string)| & \texttt{\small } & \texttt{\small } & \\\hline \verb|JULT(std::string)| & \texttt{\small } & \texttt{\small } & \\\hline
call ret start enter leave trash jmp jz jnz branch leap nil nop