Homework 7
|
The purpose of the compiler is to generate code that corresponds precisely to the semantics of the high level language. The target architecture is the JVM, the code will be JVM assembly language. Just to recap previous work.
Homework 3 defined the concrete grammar necessary for SableCC to generate a parser, however an abstract grammar results in a simpler solution for the type checker.
Homework 4 introduced the concrete to abstract grammar transformations of SableCC3 which will be applied to the MiniJava language. The implementation of the interpreter used the the DepthFirstAdapter class to visit nodes of the AST.
Homework 5 combined those ideas with that of a symbol table to implement a type checker based on the complete abstract grammar definition of MiniJava in SableCC3 form and skeleton classes for the necessary Java.
Homework 6 combined those ideas to determine where data was to be located in the frame (offset) and update the symbol table with that information.
Homework 7 will utilize much of that previous work to determine the stack size requirements of each method. As before, the parse tree and symbol table hold the critical data needed to guide the process. The parse tree will be of particular importance as every node that produces a result (e.g. addition, method call, etc.) requires at least one stack location for the result. The stack size information will be updated in the symbol table.
While the stack size need only be determined for method declarations, the declaration statements and expressions must be fully analyzed to determine the maximum stack requirements. Stack requirements depend in part upon the code generated to implement a method, more efficient code might require less stack space - the point being that we can approximate the stack size now but precise requirements cannot be determined until final code generation details are determined. We will need to revisit stack sizing.
The number of local variables must also be determined before generating code - simple as it is the number of format parameters + the number of variables defined at the beginning of the method.
Organization
The organization of the code generator is illustrated and described below.
- Main calls the following compiler modules:
- BuildSymbolTableVisitor - visits program, class and method declaration AST nodes to add symbol and type to the symbol table.
- TypeCheckVisitor - visits each non-declaration AST node to verify each use of a symbol corresponds to the operations allowed on its type.
- FrameVisitor - visits program, class and method declaration AST nodes to calculate the offsets of variables within the frame and updates the symbol table with the offsets.
- StackVisitor - visits each method declaration node of the parse tree to compute:
- the stack requirements for the method. The stack size is added to the symbol table for that class/method.
- the number of method variables, parameters and locals, can be calculated while visiting the method declaration nodes of the parse tree. The number of parameters and local variables are automatically computed as each are added to the Method object, the SymbolTable.Method class has methods that return the size of a method's parameters and local variables named: sizeParams and sizeVars repectively.
StackVisitor
The table below illustrates how the JVM utilizes the stack to hold method call parameters and intermediate results. The symbol table on the right-side contains:
- variable name, type and offset in the frame. This is a simple computation performed by JVMFrame.
- method stack size and number of method variables (frame size), both are required in a method definition.
Local variable number, which includes this and parameters, is simple to compute. Consider the following local variable numbers:
- main - 2 locals, parameter args and this.
- bm1 - 4 locals, three parameters and this.
- bm2 - 4 locals, two parameters, this, and bl.
Stack size is more challenging as the operations and method calls must be analyzed individually. The stack is used to:
- pass parameters, including this.
- hold results of operations, 4+5 leaves 9 on the stack.
- hold the return value of a method.
- hold operands of operations, x[i]=5 requires 3 stack locations.
Consider the following method stack and requirements:
- main - 5 total. In call to bm1 there are 4, 3 parameters and new B(). An additional location is required for the System.out.println call.
- bm1 - 3 total. In call to bm2 there are 3, 2 parameters and this.
- bm2 - 2 total. The * operation requires 2 stack locations for operands, the result is left on the TOS. The result of * is used with one other operand for the + operation. No more than 2 stack locations are required for any number of binary operations.
class B {
public int bm1(int bp1, int bp2, int bp3) {
return this.bm2( bp1, bp2 );
}
public int bm2(int bp1, int bp2) {
int bl;
bl = bp1 * bp2 + bp1;
return bl;
}
}
class EX7 {
public static void main(String [] args) {
System.out.println(new B().bm1(10, 20, 30));
}
}
Classes
B
Fields
Methods
bm1 class minijava.node.AIntType 3 4
Parameters
bp1 class minijava.node.AIntType 1
bp2 class minijava.node.AIntType 2
bp3 class minijava.node.AIntType 3
Variables
bm2 class minijava.node.AIntType 2 4
Parameters
bp1 class minijava.node.AIntType 1
bp2 class minijava.node.AIntType 2
Variables
bl class minijava.node.AIntType 3
EX7
Fields
Methods
main class minijava.node.AIdType 5 2
Parameters
args class minijava.node.AIdType 0
Variablesclass EX7 {
public static void main(String [] args) {
System.out.println(new B().bm1(10, 20, 30));
}
}.class public EX7
.super java/lang/Object
.method <init>()V
.limit stack 1
.limit locals 1
aload_0
invokespecial java/lang/Object/<init>()V
return
.end method
.method public static main([Ljava/lang/String;)V
.limit stack 5
.limit locals 2
getstatic java/lang/System/out Ljava/io/PrintStream;
new B ; new B returns reference to B object
dup ; duplicate the reference
invokespecial B/<init>()V ; B constructor method
bipush 10
bipush 20
bipush 30
invokevirtual B/bm1(III)I ; new B().bm1(10, 20, 30)
invokevirtual java/io/PrintStream/println(I)V
return
.end methodclass B {
public int bm1(int bp1, int bp2, int bp3) {
return this.bm2( bp1, bp2 );
}
public int bm2(int bp1, int bp2) {
int bl;
bl = bp1 * bp2 + bp1;
return bl;
}
}.class public B
.super java/lang/Object
; .method <init>()V same for all MiniJava classes
.method public bm1(III)I
.limit stack 3
.limit locals 4
aload 0 ; this
iload 1 ; bp1
iload 2 ; bp2
invokevirtual B/bm2(II)I ; this.bm2( bp1, bp2 )
ireturn
.end method
.method public bm2(II)I
.limit stack 2
.limit locals 4
iload 1 ; bp1 * bp2
iload 2
imul
iload 1 ; bp1 * bp2 + bp1
iadd
istore 3 ; bl = bp1 * bp2 + bp1
iload 3
ireturn ; return bl
.end method
Nodes that affect the Stack Size
While the stack size need only be determined for method declarations, the declaration statements and expressions must be fully analyzed to determine the maximum stack requirements. The parse tree nodes that affect stack size define the necessary points where stack size calculation must be performed.
A few important points:
- Each method must have a stack allocated.
- Statement operands are on the stack; a 3 operand statement requires 3 stack locations.
- The stack size required by a method is the maximum required by any statement in the method.
- A statement generally releases stack space used by operands.
- The stack size of a statement is the maximum of any sub-statements or expressions making up the statement.
- The actual parameters of a method call are allocated on the stack.
- Each node of the AST can calculate its own stack requirement.
As noted above, some of the nodes that affect stack size are those that:
- have operands since operands must be placed on the stack
- binary operations require two locations for the operands
- method calls require space for each parameter +1 for this
- assignment to an array index requires three locations on the stack; the array, index and the r-value.
- return of a result since the result is placed on the stack
- method executions and common binary operations return a one location result on the stack
Exp - Sizing the Stack
1 operand
Take a simple example, what is the stack size needed for false, for which we will represent by 0. Pushing a 0 places the false on the stack using 1 stack location. The StackVisitor method for AFalseliteralExp should indicate a stack size of 1 is required. The same notion of setting the stack size applies to any Exp node, one that produces a result.
bipush 0 public void caseAFalseliteralExp(AFalseliteralExp node) {
setStack(node,1);
}
Binary operands
Expressions are complicated by the fact that they may contain a method call, we will return to that in a moment.
For the naive case of a binary operation such as 4 + 5 would be implemented at left. The StackVisitor method APlusExp would be as simple as at right.
bipush 4 bipush 5
iadd
public void caseAPlusExp(APlusExp node) {
setStack(node, 2);
}The problem is that the right or left expression could require more than a single stack location. The stack size should be the maximum of the left and right expressions but at least 2. Consider when operands are method calls:
4 + m(1,2)
the left expression requires 1 stack location, the right side 3 (this and 2 parameters); the stack size should be 1+3. Reversing the order of evaluation to: m(1,2)+4 results in a stack size of 3.
Generally, if the left operand stack is greater than the right operand stack, the stack should be the the size of the left operand; if the right operand is greater or equal the left, the stack size is the right operand + 1 (the 1 is for the left operand). Since each node calculates its own stack size, we can determine which of the left and right has the maximum requirement after apply(this). All binary arithmetic, relations (e.g. <) and logical operations (e.g. &&) have essentially the same stack computations.
bipush 4 bipush 5
iadd
public void caseAPlusExp(APlusExp node) {
node.getL().apply(this);
node.getR().apply(this);
int limit = getStack(node.getL()) > getStack(node.getR()) ? getStack(node.getL()) : getStack(node.getR())+1;
limit = limit < 2 ? 2 : limit; // Min stack is 2
setStack(node, limit);
}
AMethodcallExp - Method Call Stack Size
Method calls require stack space for pushing parameters, including this. The simple case is when all parameters are constants or variables (but not fields); the stack size is simply the number of parameters in the expression list + 1 for this. For example, the following call has a stack size of 3:
aload 0 ; this
bipush 1
bipush 2
invokevirtual mm(1,2) When parameters are results of other operations the stack size increases. For example, the following has a stack size of 4 because 2+3 requires stack also:
aload 0 ; this
bipush 1
bipush 2
bipush 3
iadd
invokevirtual mm(1,2+3)
this.m 1 2 3
this.m 1 5 Complicating stack size requirements is that the position of the operation in the parameter list is important to the computation. For example, with the 2+3 as the first parameter the stack size is 3:
aload 0 ; this
bipush 2
bipush 3
iadd
bipush 1
invokevirtual mm(2+3, 1)
this.m 2 3
this.m 5 1 But with 2+3 in the second parameter the stack size is 4:
aload 0 ; this
bipush 1bipush 2
bipush 3
iadd
invokevirtual mm(1, 2+3)
this.m 1 2 3
this.m 1 5 Recall the method call grammar:
exp = {methodcall} exp id exp_list
exp_list = exp*
Calculating the stack size of a method call requires iterating over each parameter of the expression list; each parameter must be placed on the stack for the call. Each parameter adds either 1 when a literal or variable or the parameter stack size requirement (i.e. 2+3 adds 2 to stack size).
The insight is that the stack size is the maximum of any parameter + its position. For example, calling a requires 3 stack locations + 0, while calling b requires 3 stack locations + 1 due to being in position 1. The stack size of m is then 5; 4, the maximum, + 1 for this to call m. The position factor is due to parameters to the left each occupying 1 location, the result of calling a is on the stack during the call to b.
aload 0 ; this.m
aload 0 ; this.a
bipush 2
bipush 3
invokevirtual a
aload 0 ; this.b
bipush 4
bipush 5
invokevirtual b
invokevirtual m
m( a(2,3), b(4,5))
this.m this.a 2 3
this.m a result this.b 4 5
this.m a result b result AMethodDecl - Sizing the Stack
The stack size must be known when method is declared since the method heading appears as:
.method public bm1(III)I
.limit stack 3
.limit locals 4The grammar rule for method declarations is:
method_decl = type id formal_list var_decl* statement* exp;
The method declaration stack size is simply the maximum of the stack size of all statements and the return expression. In the following example, the stack size of the assignment statement is 2 and the return expression 1 so the stack size of the method is max(2, 1) or 2. Iterating over the statement list, the AMethodDecl node method can determine which statement has the maximum stack size.
public int bm2(int bp1, int bp2) { int bl; bl = bp1 * bp2 + bp1; return bl; } .method public bm2(II)I
.limit stack 2
.limit locals 4
iload 1 ; bp1 * bp2
iload 2
imul
iload 1 ; bp1 * bp2 + bp1
iadd
istore 3 ; bl = bp1 * bp2 + bp1
iload 3
ireturn ; return bl
.end method
bp1 bp2
bp1*bp2 bp1
bp1*bp2+bp1
bp1*bp2+bp1 AExpList - Applying the parameters
From the abstract grammar, the method parameters are defined as a list of expressions, requiring the stack size to be calculated for each. Of course, the stack size is caalculated when each parameter is applied so the problem becomes that of accessing each in the expression list, applying the expression and summing the stack size of each.
The AExpList node is where the parameters can be accessed one at a time, the grammar shows the exp* field to be a of list of 0 or more expressions. The exp* is a Java LinkedList object, allowing access to each exp of the list.
See TypeCheckVisitor for an example of use.
The final step is to add the stack size to the corresponding method object in the symbol table. There is a Method setStack method, see SymbolTable.Method file and the following discussion.
Stack Size set and get
Recall the Interpreter from Homework 4 which set the result of each node. The stack size set and get of a node can be defined by the following:
// Get stack size of node
public int getStack(Node node) {
return ((Integer) getIn(node)).intValue();
}// Set stack size of node
public void setStack(Node node, int n) {
setIn(node,new Integer(n));
}
Counting Variables
The SymbolTable.Method class has methods that return the size of a method's parameters and local variables named: sizeParams and sizeVars repectively.
Verification
Analyzing simple Java programs is one way to determine the stack size requirements, actually compiling the Java to a class file then translating to JVM assembly is a good check of your analysis. The disassembler named jasper works by inputting class files and outputting assembler. Instructions for jasper are below.
Use HW6 as a starting point.
An error exists in one Homework 5 file, download BuildSymbolTableVisitor.java to BuildSymbolTableVisitor project directory.
Home: java -jar \sablecc-3-beta.3.altgen.20041114\lib\sablecc.jar minijava.js
IUS: java -jar
v:\common\user\C431\sablecc.jar minijava.js
del Main.class
del minijava\*.class
javac -classpath . Main.java
java -cp . Main Factorial.java
jasper is a disassembler of Java class files to an assembly program. Basically the steps are:
- Cover sheet with your name, C431, date, and Homework 7.
- Print out of StackVisitor.
- Print out of symbol table produced from the three test results.