Chapter 7
Translation to
Intermediate Code

powered by FreeFind

Modified: 

RESOURCES

Download Chapters 7 and 8 software

Test package contains test.java that can be used to test your understanding of Chapter 7 and 8. The file contains an example of:

The file can be edited to verify your understanding of the input and results of each phase.

Directions:

  1. Unzip to directory: Chap8
  2. At command prompt, change to Chap8 directory.
  3. Compile by:  javac -classpath . Test\test.java
  4. Execute by: java -cp . Test.test

OVERVIEW

There are two basic approaches for parse tree to code translation:

Architecture

To understand the translation process we will examine a few examples translating an AST node to a linear representation targeted for a specific architecture.

The code will be generated essentially in post-order, after visiting each sub-node and generating code. In practice, the code could be written to a file and then assembled.

386 - For an expression node, we must decide where to place results of code execution. Some possibilities are to generate code to leave the result on stack or in registers. Let's use registers and specifically the eAx register since it is commonly used by commercial compilers. 

APlusExp - Each sub-node of Plus will also use the eAx register to return a result; the result of the left node will be returned in eAx which is overwritten by execution of the code for the right node.

The apply() method generates the code for that node, the following generates code for the left node: node.getL().apply(this);

For the expression: a + 3 * 4, the code generated would be at right:

a + 3 * 4

      +
    /    \
 Id      *
  |     /    \
  a  Literal Literal
       |      |
       3     4
 public void caseAPlusExp(APlusExp node) {
    node.getL().apply(this);

    System.out.println( "Push    eAx");

    node.getR().apply(this);
       
    System.out.println( "Pop      eBx");
    System.out.println( "Add      eAx, eBx");
}

Mov      eAx, a        ; left exp
Push     eAx
Mov      eAx, 3        ; right exp
Mov      eBx, 4
Imul      eBx
Pop       eBx           ; eAx = left + right
Add       eAx, eBx

AWhileStatement - The while has two sub-nodes, an expression and a statement that is executed when the expression is true.

We will generate expression code first by visiting the Exp sub-node, leaving the result of 1 for true or 0 for false in eAx. Statement code will be generated by visiting the Statement sub-node.

Note that MiniJava statements generally produce side-effects but do not return results. Also that the variable a would need to be represented by an absolute address in the heap, frame offset, global variable, etc. The labels would, of course, need to be unique.

while( a < 3 )

    a = a + 1;

    while
    /    \
 Exp Statement
  |        |
a<3   a = a + 1

public void caseAWhileStatement(AWhileStatement node) {

   System.out.println( "while:");

   node.getExp().apply(this);

   System.out.println( "   Cmp     eAx, 0");
   System.out.println( "    Je        endWhile");

   node.getStatement().apply(this);

   System.out.println( "    Jmp     while");
   System.out.println( "endWhile:");
}

while:

Mov      eAx, 1       ; Exp a < 3
Mov      eBx, a       
Cmp     eBx, 3
Jb        true
Mov      eAx, 0

true:

Cmp      eAx, 0
Je          endWhile

Add        a, 1          ; Statement

Jmp        while

endWhile:

Exercise

 

7.1 INTERMEDIATE REPRESENTATION TREES

Requirements for intermediate language

Intermediate representation trees

The intermediate representation is an abstract architecture assembly language.

There are two groups of operators in the assembly language: expressions (Exp) and statements (Stm).

The following are abstract assembly language operations:

Exp - operation to compute a value with possible side-effects
  • CONST(i) - The integer literal 12 would be:

    CONST(12)

  • NAME(n) - A symbolic constant n corresponding to an assembly language label:

    NAME(new Label("printInt"))

  • TEMP(t) - Temporary t, similar to a register but the abstract machine has infinite number. The following constructs a unique Temp number, TEMP(t) is similar then to eAx on an Intel processor, it distinguishes one temporary from another:

    TEMP(new Temp())

  • BINOP(o, e1, e2) - Apply operator o to operands e1 and e2 where e1 is evaluated first. The following is 2+5

    BINOP(BINOP.PLUS,CONST(2),CONST(5))

  • MEM(e) - Dereferences wordsize bytes starting at address e. On left side means store, on right means fetch. The contents of a new Temp numbered 4 is:

    MEM(TEMP(new Temp(4)))

  • CALL(f,l) - Applies function f to argument list l. f is evaluated before evaluating the arguments in left to right order. Apply printInt to the list args:

    CALL( NAME(new Label("printInt")), args)

  • ESEQ(s, e) - Statement s is evaluated with possible side-effects then e is evaluated for results. A side-effect statement that moves 12 to a temporary numbered 4 is:

    MOVE(TEMP(new Temp(4)), CONST(12))

    Performing the side-effect and expression that adds 5 to the temporary: (4) = 12; (4) + 5:

    ESEQ( MOVE(TEMP(new Temp(4)), CONST(12)), BINOP(BINOP.PLUS, TEMP(new Temp(4)), CONST(5)))

    or

    Temp t = new Temp(4);

    ESEQ( MOVE(TEMP(t), CONST(12)), BINOP(BINOP.PLUS, TEMP(t), CONST(5)))

Stm - perform side-effects and control flow
  • MOVE( TEMP t, e) - Evaluate e and move result to temporary t. Move 12 to a temporary numbered 4:

    MOVE(TEMP(new Temp(4)), CONST(12))

    Store the address of temporary 4 to temporary 5 (i.e. 5 references 4):

    MOVE(TEMP(new Temp(5)), MEM(TEMP(new Temp(4))))

  • MOVE( MEM(e1), e2) - Evaluate e1 yielding address a. Evaluate e2 and store wordSize byte result at address a. The following stores CONST(12) at the address of temporary 5:

    MOVE( MEM( TEMP( new Temp(5)), CONST(12))

  • EXP( e ) - Evaluate e and discard result.

    EXP( CONST(12) )

  • JUMP( e, labels ) - Jump to address e which may be literal label or a computed address. C switch( i ) statements may be implemented by arithmetic on i to compute the label from the list to jump. For dataflow analysis of Chapter 17, all possible e locations must be specified. The common case of jumping to a known label is:

    JUMP( new Label( "printInt") )

  • CJUMP( o, e1, e2, t, f ) - Compare evaluation of e1 and e2 using relation operator o; if result is true jump to address t else jump to address f. To jump to while when 3 < 5 is true and endwhile when false:

    CJUMP( CJUMP.LT, CONST(3), CONST(5), new Label("while"), new Label("endwhile"))

  • SEQ( s1, s2 ) - Statement s1 followed by statement s2; the purpose of SEQ is to construct the IR tree, each SEQ defines a parent node of two child statements.

                          SEQ
                         /      \
                        s1     s2

    The following does nothing useful:

    SEQ( EXP( CONST(12) ), EXP( CONST(12) ) );

    The following is invalid because CONST is an Exp rather than a Stm:

    SEQ( CONST(12), CONST(12) );

    This initializes temporary 4 to integer 12 and jumps when less than 30 to while otherwise to endwhile

    SEQ(
        MOVE(TEMP(new Temp(4)), CONST(12)),
        CJUMP( CJUMP.LT, TEMP(new Temp(4)), CONST(30), new Label("while"), new Label("endwhile"))
    )

  • LABEL( n ) - Constant value of name n is the current machine code address, a label definition in assembly language. NAME( n ) may be used for jump, calls, etc. target. Defining a while of:

    while( 2 < 1 )

        Temp4 = 5;

    // Assumes while, T and F are defined as new Labels(...)

    SEQ(
        SEQ(
            SEQ(
                LABEL( while ),
                CJUMP(Tree.CJUMP.LT, CONST(2), CONST(1), T, F))
            ),
            SEQ(
                LABEL( T ),
                SEQ(
                    MOVE(TEMP(new Temp(4)), CONST(5)),
                    JUMP( while )
                )
            )
         ),
         LABEL( F )
    )
                                   SEQ      
                              /               \
                             SEQ            LABEL(F)
                       /                \
                  SEQ                   SEQ   
               /         \             /           \
    LABEL(while)  CJUMP   LABEL(T)    SEQ
                                                   /        \
                                            MOVE       JUMP(while)
                   
       

TREE of SEQs

An AST of a program is a tree, so is an IR of the program. SEQs serve to construct the tree as in the above while implementation.

IR Instruction Implementation
package Tree;  
abstract class Exp{};

public class CONST extends Exp {
   public CONST(int v);

public class NAME extends Exp {
   public NAME(Label l) ;

public class TEMP extends Exp {
   public TEMP(Temp t)

public class BINOP extends Exp {
   public BINOP(int b, Exp l, Exp r);
   public final static int PLUS=0, MINUS=1, MUL=2, DIV=3,
                                AND=4,OR=5,LSHIFT=6,RSHIFT=7,
                                ARSHIFT=8,XOR=9;

public class MEM extends Exp {
   public MEM(Exp e)

public class CALL extends Exp {
   public CALL(Exp f, ExpList a)

public class ESEQ extends Exp {
   public ESEQ(Stm s, Exp e)

abstract class Stm{};

public class MOVE extends Stm {
   public MOVE(Exp d, Exp s)

public class EXP extends Stm {
   public EXP(Exp e)

public class JUMP extends Stm {
   public JUMP(Exp e, LabelList t)
   public JUMP(Label target)

public class CJUMP extends Stm {
   public CJUMP(int rel, Exp l, Exp r, Label t, Label f)
   public final static int EQ=0, NE=1, LT=2, GT=3, LE=4,
                                GE=5, ULT=6, ULE=7, UGT=8, UGE=9;

public class SEQ extends Stm {
   public SEQ(Stm l, Stm r)

public class LABEL extends Stm {
   public LABEL(Label l)

public class ExpList extends Exp {
   public ExpList(Exp head, ExpList tail)
public class StmList {
   public StmList(Stm h, StmList t)

 

Note

To improve readability of all IR examples, each class constructor is invoked via a static method similar to the following:

private static Tree.Exp CONST(int value) { return new Tree.CONST(value); }

so that we can use:

CONST( 3 )

rather than:

new CONST( 3 )

 

7.2 TRANSLATION INTO TREES

Example - To get a feel for translating from AST into IR consider the expression 3 + 4.

Translation requires that each code producing AST node define the corresponding IR code; the result of translating 3 + 4 from AST to IR is one branch of an IR code tree. With most of the AST translating one-to-one IR, the IR tree will have the same structure but a different representation language, one more amenable to analysis and eventual machine language code generation.

3 + 4
public void caseAPlusExp(APlusExp node) {
   node.getL().apply(this);
   node.getR().apply(this);

   // BINOP (PLUS, Exp(L), Exp(R) )
   setExp(node,
      new Ex(
         BINOP(
            Tree.BINOP.PLUS,
            getExp(node.getL()).unEx(),
            getExp(node.getR()).unEx()
         )
      )
   );
}
AST
             APlusExp
         L  /            \  R
           /               \
AIntegerLiteral  AIntegerLiteral
          |                 |
          3                4

IR
                       BINOP    
              /         |          \
BINOP.PLUS  CONST(3)  CONST(4)

MiniJava clearly differentiates between expressions and statements so there is not much need for such translation. However, C and many other languages have statements that behave as expression and vice-versa:

        if( (a=3) > 4) ....

 

Translating Exp's to Stm's - In translating, there are 3 main types of expressions to consider:

Exp - The base Exp translating class:

package Translate;

public abstract class Exp {
  abstract Tree.Exp unEx();
  abstract Tree.Stm unNx();
  abstract Tree.Stm unCx(Temp.Label t, Temp.Label f);
}

Nx - We'll first examine the simple Nx class translation.

Given a Stm the constructor returns null when translating to Ex or Cx and simply returns the Stm for a Nx.

package Translate;

public class Nx extends Exp {
  Tree.Stm stm;

  Nx(Tree.Stm s) { stm = s; }

  Tree.Exp unEx() { return null; }

  Tree.Stm unNx() { return stm; }

  Tree.Stm unCx(Temp.Label t, Temp.Label f) { return null; }
}
 

These aren't good examples but the following does show that Nx prevents nonsense translations:

Translate.Nx a = new Nx(MOVE(TEMP(new Temp()), CONST(1)));

Tree.Exp e = a.unEx();            // e = null

Tree.Stm s1 = a.unCx();         // s1 = null

Tree.Stm s2 = a.unNx();         // s2 = MOVE(TEMP(new Temp()), CONST(1))

A more realistic example demonstrates the implementation of the while.

The while is a Nx so the node is new Nx( ... )

The body of the while must be a Stm of Nx so the following simply returns the body:

body.unNx();

Remember that Ex, Nx and Cx define nodes that can, in some cases be translated from an Ex to an Nx or Cx, etc. This is important in languages such a C which blur statements, expressions and conditionals but not for MiniJava.

 

public void caseAWhileStatement(AWhileStatement node) {
   node.getExp().apply(this);
   node.getStatement().apply(this);

   Label test = new Label();
   Label T = new Label();
   Label F = new Label();
   Exp exp = getExp(node.getExp());
   Exp body = getExp(node.getStatement());

   setExp(node,
      new Nx(
         SEQ(
            SEQ(
               SEQ(
                  LABEL(test),
                  CJUMP(Tree.CJUMP.EQ, exp.unEx(), CONST(1),T,F)
               ),
               SEQ(
                  SEQ(
                      LABEL(T),
                      body.unNx()
                  ),
                  Tree.JUMP( test )
               )
            ),
            LABEL(F)
         )
      )
   );
}
 

For:

    while(i < 5) i = i + 1;


The interesting parts of the while are both Exp's:

exp.unEx() which represents: i < 5

The section on Ex will show how the conditional expression is implemented.

body.unNx() which represents: i = i + 1

i + 1: Is also an Ex.

i = i + 1: Assignment is a Nx done below.

Exercise

               SEQ(
                  LABEL(test),
                  CJUMP(Tree.CJUMP.EQ, exp.unEx(), CONST(1),T,F)
               )
 

                 SEQ(
                      LABEL(T),
                      body.unNx()
                  )

 
public void caseAAssignStatement(AAssignStatement node) {
   node.getId().apply(this);
   node.getExp().apply(this);

   Exp e1 = getExp(node.getId());
   Exp e2 = getExp(node.getExp());

   if (e1.unEx() instanceof Tree.TEMP)
      setExp(node, new Nx(MOVE (e1.unEx(), e2.unEx())));
   else {
      Temp z = new Temp(0);
      setExp(node,
          new Nx(
               MOVE(
                   MEM(
                       BINOP(
                          Tree.BINOP.PLUS,
                          TEMP(z),
                          e1.unEx()
                       )
                    ),
                    e2.unEx()
                )
            )
         );
    }
}

Assignment can be to a TEMP (temporary or register) or MEM (memory).

    Temp z = new Temp(0);

    TEMP(z) + e1.unEx()

is the address of x in the current frame since x would be an offset and Temp(0) is the first allocation in the frame.

 

Ex - We'll examine the Ex class translation methods and try to place more in context of intermediate code generation.

package Translate;

public class Ex extends Exp {
   Tree.Exp exp;

   Ex(Tree.Exp e) { exp = e; }

   Tree.Exp unEx() { return exp; }

   Tree.Stm unNx() { return new Tree.EXP(exp); }

   Tree.Stm unCx(Label t, Label f) {
      if (exp instanceof Tree.CONST) {       // if exp is a constant, emit JUMP statement.
         Tree.CONST c = (Tree.CONST)exp;
         if (c.value == 0)
            return new Tree.JUMP(f);
         else
            return new Tree.JUMP(t);
      }
      return new Tree.CJUMP(Tree.CJUMP.NE, exp, new Tree.CONST(0), t, f);
   }
}

public void caseALtExp(ALtExp node) {
   node.getL().apply(this);
   node.getR().apply(this);

   Exp expl= getExp(node.getL());
   Exp expr= getExp(node.getL());

   Label T = new Label();
   Label F = new Label();
   Temp t = new Temp();

   setExp(node,
      new Ex(
          ESEQ(
              SEQ(
                  SEQ(
                      SEQ(
                          MOVE(TEMP(t),CONST(0)),
                          CJUMP(Tree.CJUMP.LT, expl.unEx(), expr.unEx(), T, F)
                      ),
                     SEQ(
                          LABEL(T),
                          MOVE(TEMP(t),CONST(1))
                     )
                  ),
                  LABEL(F)
               ),
               TEMP(t)
            )
         )
     ) ;
}

Cx - Will be examined later in the context of conditionals.

Exercise - What is the purpose above of:

                      SEQ(
                          MOVE(TEMP(t),CONST(0)),
                          CJUMP(Tree.CJUMP.LT, expl.unEx(), expr.unEx(), T, F)
                      )
 

SIMPLE VARIABLES

InFrame

Variables declared in the current frame are represented by offsets from the current frame pointer.

Primitive data types

Recall that frames hold parameters and local variables which may be primitive (int) or an object (reference). It matters when accessing the variable: a primitive value is in the frame, an object holds a memory reference to structure.

Frame variables are accessed by:

frame pointer + offset(variable)

With fp (frame pointer) at address 101 and variable v at offset 2, address 103 is accessed:

fp + offset(v) = 101 + 2 = 103

The value of a primitive variable v is then accessed by MEM:

MEM(v) = MEM(fp + offset(v)) = MEM(103) = 100

Offset   Frame   MEM address
 
fp+0
fp+1
fp+2  v  
fp+3
34
104
106
100
43
23
121
100
101
102
103
104
105
106

Since offset(v) is fixed at compile time, the computation to access primitive v is then:

    MEM(fp+offset(v)) = MEM(fp+2)

or in IR:

MEM(
    BINOP(
            BINOP.PLUS,
            TEMP(fp),
            CONST(2)
    )
)

where fp is the frame pointer register. You might recall the Frame class holds the definition of the frame pointer in terms of the specific target architecture.

Assignment to an InFrame Primitive

v = 3
Offset   Frame   MEM address
 
fp+0
fp+1
fp+2  v  
fp+3
34
104
106
100
43
23
121
100
101
102
103
104
105
106
MOVE(
    MEM(
       BINOP(
            BINOP.PLUS,
            TEMP(fp),
            CONST(2)
       )
     ),
     CONST(3)
)

References

References must be dereferenced. The general method where v is a reference in the current frame requires nested MEM:

MEM(MEM(v)) = MEM(MEM(fp + offset(v))) = MEM(MEM(103)) = MEM(100) = 34

or in IR:

MEM(
  MEM(
    BINOP(
            BINOP.PLUS,
            TEMP(fp),
            CONST(2)
    )
  )
)

Objects

Offset   Frame   MEM address
 
fp+0 this
fp+1
fp+2  v  
         x
         y
         f
34
104
106
100
43
23
121
100
101
102
103
104
105
106

Objects are references to structures that can only be validly dereferenced through this. Consider an object defined by:

class B {    int x, y, f; }

The general method to dereference a field f where this is at the current frame offset 0 and the offset( f ) is relative to the starting address of the B object:

MEM(MEM(f)) = MEM(MEM(fp)+offset(f)) = MEM(MEM(101)+2) = MEM(104+2) =MEM(106) = 34

or in IR:

MEM(
    BINOP(
            BINOP.PLUS,
            MEM(TEMP(fp)),
            CONST(Offset(f))
    )
)

InReg

Registers are TEMP's and are not in the frame. TEMP variable a is then:

TEMP( a )

Translating variable access

Access by variable ID is then a matter of generating the correct IR node, either a TEMP or CONST for the offset.

public void caseAIdExp(AIdExp node) {
   SimpleExp se = currMethod.getVar( node.getId().toString()).getExp();

   if (se instanceof Temp) {
      setExp(node, new Ex(
                               TEMP((Temp)(se))));
      return;
   }
   else
       if (se instanceof Offset) {
          setExp(node, new Ex(
                                CONST( ((Offset)se).value())));
          return;
       }
   setExp(node, null);
}

 

ARRAY VARIABLES

Some languages (Pascal) define an array variable where on assignment, the values of one array are copied from the right-side variable to the left-side variable. The following copies the 5 values of b to a.

var a, b : array[0..4] of integer;
begin
    a := b;
end

C defines pointers where arrays are constant pointers which cannot be assigned a new reference.

Not legal

  int a[5], b[5];

  a = b;

Legal - a aliases b

  int *a, b[5];

  a = b;

MiniJava defines arrays as variable pointers which can be assigned a new reference. Remember that every non-primitive is a pointer.

a aliases b

  int [] a;
  int [] b;

  b = new int[5];

  a = b;

 

STRUCTURED L-VALUES

 

SUBSCRIPTING AND FIELD SELECTION

C arrays are constant pointers (fixed addresses). Calculating the address of a one-dimensional array subscript based at 0 in C depends on the word size of the array components:

int a[5];

&a[i] = &a + i

For word size greater than 1 the computation is:

&a[i] = &a + i*word size

Java arrays are variable pointers which must be dereferenced. Calculating the address of a one-dimensional array subscript based at 0 in Java also depends on the word size of the array components (using the C meaning of * for dereferencing):

int [] a = int[5];     

&a[i] = *a + i

145 245 765 476 189
100 101 102 103 104

                                     

100

a

For word size greater than 1 the computation is:

&a[i] = *a + i*wordsize

Translation of array variable subscripts then depends upon the language use of arrays. Java arrays are objects or a pointer that requires a dereference for each access or, more likely, the pointer reference would be copied to a register.

Access to a non-indexed object is:

MEM(MEM(v)) = MEM(MEM(fp + offset(v))) = MEM(MEM(103)) = MEM(100) = 34

or in IR:

MEM(
  MEM(
    BINOP(
            BINOP.PLUS,
            TEMP(fp),
            CONST(2)
    )
  )
)
Offset   Frame   MEM address
 
fp+0 
fp+1 
fp+2 v   
fp+3
34
21
78
100
101
23
121
100
101
102
103
104
105
106

 Array access is similar to regular objects with the addition of an index; for an InFrame array a[2] of wordsize 1:

MEM(MEM(a)+i) = MEM(MEM(fp + offset(a))+i) = MEM(MEM(105)+i) = MEM(102+2) = MEM(104) = 87

or in IR:

MEM(
  BINOP(
    BINOP.PLUS,
    MEM(
       BINOP(
            BINOP.PLUS,
            TEMP(fp),
            CONST(4)
       )
    ),
    CONST(1)
  )
)
Offset   Frame   MEM address
 
fp+0 this
fp+1 a[0]
fp+2 a[1]   
fp+3 a[2]
fp+4 a
 
34
201
78
34
87
102
121
100
101
102
103
104
105
106

Translation from AST to IR for wordsize 4 is:

   public void caseAArrayindexExp(AArrayindexExp node) {
    node.getArray().apply(this);
    node.getIndex().apply(this);
 
      Temp t_index = new Temp();
      Temp t_size = new Temp();
      Tree.Exp e1 = getExp(node.getArray()).unEx();
      Tree.Exp e2 = getExp(node.getIndex()).unEx();
 
      Label F = new Label();
      Label T = new Label();
     
      LinkedList argList = new LinkedList();     
      Tree.ExpList args = new Tree.ExpList(argList);
      Tree.Stm s1 =
      SEQ(
         SEQ(
            SEQ(
               SEQ(
                  SEQ(
                     MOVE(TEMP(t_index),BINOP(Tree.BINOP.MUL,e2,CONST(4))),
                     MOVE(TEMP(t_size),MEM(e1))
                   ),
                   CJUMP(Tree.CJUMP.GE,TEMP(t_index),TEMP(t_size),T,F)
                ),

                LABEL(T)
              ),

              MOVE(TEMP(new Temp()), CALL(NAME(new Label("_error")),args))
          ),
          LABEL(F)
      );
     
      Temp t = new Temp();
      Stm s2 =
         SEQ(
             s1,
             MOVE(TEMP(t),
                       MEM(
                              BINOP(BINOP.PLUS,
                                         e1,
                                         BINOP(BINOP.PLUS,
                                                    BINOP(BINOP.MUL, e2, CONST(4)),
                                                   
                                         )
                               )
                       )
              )
          );
      setExp(node, new Ex(ESEQ(s2,TEMP(t))));
  }

INDEX CHECKING

Out of range index errors are a source of many program failures and buffer over-run is exploited by hackers. The following excerpt from the above IR tests that the address computed is within the range of some array addresses.

CJUMP(Tree.CJUMP.GE,TEMP(t_index),TEMP(t_size),T,F)

Exercise

  1. What boundary condition does the above check?
  2. Is the other boundary condition checked?

 

ARITHMETIC

IR arithmetic is a simple translation.

Though no IR unary operators, most can be implemented using binary; unary negation is subtraction from 0. An example:

3 + 4
public void caseAPlusExp(APlusExp node) {
   node.getL().apply(this);
   node.getR().apply(this);

   //BINOP (PLUS, Exp(L), Exp(R) )
   setExp(node,
      new Ex(
         BINOP(
            BINOP.PLUS,
            getExp(node.getL()).unEx(),
            getExp(node.getR()).unEx()
         )
      )
   );
}
AST
             APlusExp
         L  /            \  R
           /               \
AIntegerLiteral  AIntegerLiteral
          |                 |
          3                4

IR
                       BINOP    
              /         |          \
BINOP.PLUS  CONST(3)  CONST(4)

 

 

CONDITIONALS

We have already seen the while translation.

The conditional CJUMP must evaluate expression exp.unEx() to 1 in order to execute the body; the expression most often is a conditional such as x < 4 && y > 3.

The effect is to create a large number of  labels and some rather inefficient branching. Think of the IR for each less than expression and the separate branching required by each.

public void caseAWhileStatement(AWhileStatement node) {
   node.getExp().apply(this);
   node.getStatement().apply(this);

   Label test = new Label();
   Label T = new Label();
   Label F = new Label();
   Exp exp = getExp(node.getExp());
   Exp body = getExp(node.getStatement());

   setExp(node,
      new Nx(
         SEQ(
            SEQ(
               SEQ(
                  LABEL(test),
                  CJUMP(Tree.CJUMP.EQ, exp.unEx(), CONST(1),T,F)
               ),
               SEQ(
                  SEQ(
                      LABEL(T),
                      body.unNx()
                  ),
                  Tree.JUMP( test )
               )
            ),
            LABEL(F)
         )
      )
   );
}

Cx - an abstract class that must be inherited and the unCx method over-ridden; there is no Cx constructor. It is designed to simplify and formalize the IR code for conditionals.

RelCx - translates conditionals. Its purpose is to extend Cx class by defining an unCx with the CJUMP needed by the unEx and unNx methods.

The following has been modified for readability; new Tree. has been removed.

package Translate;
  
public abstract class Cx extends Exp {
 
  Tree.Exp unEx() {
      Temp r = new Temp();
      Label t = new Label();
      Label f = new Label();
 
      return ESEQ(
                   SEQ(
                      MOVE( TEMP(r),  CONST(1)),
                      SEQ(
                          unCx(t, f),
                          SEQ(
                              LABEL(f),
                              SEQ(
                                   MOVE(TEMP(r), CONST(0)),
                                   LABEL(t)
                               )
                           )
                        )
                     ),
                     TEMP(r)
                 );
  }
 
  abstract Tree.Stm unCx(Label t, Label f);
 
  Tree.Stm unNx() {
      Label join = new Label();
 
      return SEQ(
                     unCx(join, join),
                     LABEL(join)
                  );
   }
}
package Translate;

public class RelCx extends Cx {
    int op;
    Tree.Exp left, right;

    RelCx(int o, Tree.Exp l, Tree.Exp r) {
        op = o;
        left = l;
        right = r;
    }

    Tree.Stm unCx(Label t, Label f) {
         return CJUMP(op, left, right, t, f);
    }
}
 

 

IfThenElseExp - Extends Exp to make use of the RelCx. It is listed below but will not be discussed.

package Translate;

public class IfThenElseExp extends Exp {
    Exp cond, a, b;
    Label t = new Label();
    Label f = new Label();
    Label join = new Label();
 
    IfThenElseExp(Exp cc, Exp aa, Exp bb) {
            cond = cc;
            a = aa;
            b = bb;
    }
 
    private static Tree.Stm SEQ(Tree.Stm left, Tree.Stm right) {
            if (left == null)
                return right;
            if (right == null)
                return left;
            return new Tree.SEQ(left, right);
    }
    private static Tree.LABEL LABEL(Label l) { return new Tree.LABEL(l); }
    private static Tree.Exp ESEQ(Tree.Stm stm, Tree.Exp exp) {
            if (stm == null) return exp;
            return new Tree.ESEQ(stm, exp);
    }
    private static Tree.Stm MOVE(Tree.Exp dst, Tree.Exp src) {
            return new Tree.MOVE(dst, src);
    }
    private static Tree.Stm JUMP(Label l) { return new Tree.JUMP(l); }
    private static Tree.Exp TEMP(Temp t)   { return new Tree.TEMP(t); }
 
    Tree.Stm unCx(Label tt, Label ff) {
        Tree.Stm aStm = a.unCx(tt, ff);
        if (aStm instanceof Tree.JUMP) {
                Tree.JUMP aJump = (Tree.JUMP)aStm;
                if (aJump.exp instanceof Tree.NAME) {
                        Tree.NAME aName = (Tree.NAME)aJump.exp;
                        aStm = null;
                        t = aName.label;
                }
            }
        Tree.Stm bStm = b.unCx(tt, ff);
        if (bStm instanceof Tree.JUMP) {
                Tree.JUMP bJump = (Tree.JUMP)bStm;
                if (bJump.exp instanceof Tree.NAME) {
                        Tree.NAME bName = (Tree.NAME)bJump.exp;
                        bStm = null;
                        f = bName.label;
                }
            }
 
        Tree.Stm condStm = cond.unCx(t, f);
 
        if (aStm == null && bStm == null)
                return condStm;
            if (aStm == null)
                return SEQ(condStm, SEQ(LABEL(f), bStm));
            if (bStm == null)
                return SEQ(condStm, SEQ(LABEL(t), aStm));
            return SEQ(condStm,
                           SEQ(SEQ(LABEL(t), aStm),
                               SEQ(LABEL(f), bStm)));
    }
 
    Tree.Exp unEx() {
        Tree.Exp aExp = a.unEx();
        if (aExp == null)
                return null;
            Tree.Exp bExp = b.unEx();
            if (bExp == null)
                return null;
            Temp r = new Temp();
            return ESEQ(SEQ(SEQ(cond.unCx(t, f),
                                        SEQ(SEQ(LABEL(t),
                                                    SEQ(MOVE(TEMP(r), aExp),
                                                            JUMP(join))),
                                                SEQ(LABEL(f),
                                                    SEQ(MOVE(TEMP(r), bExp),
                                                            JUMP(join))))),
                                    LABEL(join)),
                            TEMP(r));
    }
 
    Tree.Stm unNx() {
        Tree.Stm aStm = a.unNx();
        if (aStm == null)
                t = join;
            else
                aStm = SEQ(SEQ(LABEL(t), aStm), JUMP(join));
 
            Tree.Stm bStm = b.unNx();
            if (bStm == null)
                f = join;
            else
                bStm = SEQ(SEQ(LABEL(f), bStm), JUMP(join));
 
            if (aStm == null && bStm == null)
                return cond.unNx();
 
            Tree.Stm condStm = cond.unCx(t, f);
 
            if (aStm == null)
                return SEQ(SEQ(condStm, bStm), LABEL(join));
 
            if (bStm == null)
                return SEQ(SEQ(condStm, aStm), LABEL(join));
 
            return SEQ(SEQ(condStm, SEQ(aStm, bStm)), LABEL(join));
    }
}

IfStatement

The regular IR for the AST AIfStatement node is surprisingly simple. For an if statement of:

if (a < 3 && x < a) x = 2 else a = 3;

public void caseAIfStatement(AIfStatement node) {
    node.getExp().apply(this);
    node.getTrue().apply(this);
    node.getFalse().apply(this);
 
       Label T = new Label();
       Label F = new Label();
       Label D = new Label();
       Exp exp = getExp(node.getExp());
       Exp stmT = getExp(node.getTrue());
       Exp stmF = getExp(node.getFalse());

       setExp(node,
         new Nx(
           SEQ(
              SEQ(
                SEQ(
                  SEQ(
                    CJUMP(Tree.CJUMP.EQ, exp.unEx(), CONST(1),T,F),
                    SEQ(
                        LABEL(T),
                        stmT.unNx()

                     )
                   ),
                   JUMP(D)
               ),
               SEQ(
                   LABEL(F),
                   stmF.unNx()

                )
             ),
             LABEL(D)
           )
         )
      );
}

 

STRINGS

String literals are implemented as a constant address of initialized memory. In assembly language, a label would be generated for the string. For example, Intel assembly:

str1:    db    "Hello World"

In Java, a string is an object so that:

String s = "Hello World";

assigns s the address of the location of "Hello World" string.

The translator needs to make a new label for each literal by: Tree.NAME( new Label("str1") ) and generate an assembly language (not IR Tree) code fragment to reserve and initialize a block of memory with the string.

 

RECORD AND ARRAY CREATION

Primitive scalar variables can be dynamically allocated on the stack as part of the frame.

Records and arrays (objects) can outlive the method in which created to be accessed after the method has completed; the frame is deallocated on method completion so cannot be used for references. Consider:

int [] f() {

    int [] a = new int[5];
        :
        :
    return a;
}

145 245 765 476 189 length 5
100 101 102 103 104

                                     

100

a

Heap - an area of static memory allocated for non-frame data.

Array creation

  1. Determine space requirements. Example requires 5 words+1 for length field of array.
  2. Call external (not IR) function to allocate heap space and return address of first location.
  3. Generate code for initializing array values to 0.
  4. Generate code for saving array length at offset 0.

The AST to IR translation for new int [ exp ] is:

  public void caseANewarrayExp(ANewarrayExp node) {   
       node.getExp().apply(this);
 
       Temp t1 = new Temp();                            // array offset 0 address
       Temp t2 = new Temp();
       Label cj = new Label();
       Label F = new Label();
       Label T = new Label();
 
       Exp exp1 = getExp(node.getExp());
   
    // 1. Determine array size = exp + 1 for wordsize 1
       Tree.Exp size =
               BINOP(
                  Tree.BINOP.MUL,
                  BINOP(Tree.BINOP.PLUS,exp1.unEx(),CONST(1)),
                  CONST(1));                                     // wordsize 1

     // 2. Call external _halloc get pointer to space allocated in t1
       LinkedList argList = new LinkedList();
       argList.add(size);
      
       Tree.ExpList args = new Tree.ExpList(argList);
      
       Tree.Stm s1 =                                            // s1 allocates
          MOVE(
               TEMP(t1),                                           // t1 is start of array
               CALL(NAME(new Label("_halloc")), args));
      
     // 3.   Initialization
       Tree.Stm s2 =                                             // s2 initializes
          SEQ(
             SEQ(
                 SEQ(
                    SEQ(
                       SEQ(
                          SEQ(
                             MOVE(TEMP(t2),CONST(1)),    // start at offset 1
                             SEQ (
                                LABEL(cj),
                                CJUMP(Tree.CJUMP.LT,TEMP(t2),size,F,T)
                              )
                            ),
                            LABEL(T)
                         ),

                         MOVE(
                             MEM(
                                 BINOP(
                                     Tree.BINOP.PLUS,
                                     TEMP(t1),
                                     TEMP(t2)
                                  )
                             ),
                             CONST(0)                             // initialize to 0
                          )
                       ),

                       MOVE(
                           TEMP(t2),
                           BINOP(
                               Tree.BINOP.PLUS,
                               TEMP(t2),
                               CONST(1)                           // wordsize 1
                           )
                       )
                    ),

                    JUMP(cj)
                 ),
           // 4. array length in array offset 0

                 SEQ(                                   
                     LABEL(F),
                     MOVE(
                         MEM(
                             TEMP(t1)),
                             BINOP(
                                  Tree.BINOP.MUL,
                                  exp1.unEx(),
                                  CONST(1)                        // wordsize 1
                              )
                          )
                      )
                  );

          
       setExp(node,
              new Ex(
                  ESEQ(                                             // s1 allocates
                    SEQ(s1,s2),                                   // s2 initialaizes
                    TEMP(t1)                                     // t1 holds heap reference
                  )
               )
       );

  }

Calling runtime-system functions

The above presents an example of calling an external memory allocation function. To allocate size locations where size is an IR Exp:

MOVE(
      TEMP(t1),                                      
      CALL(NAME(new Label("_halloc")), new ExpList(size, null))
);
Exp size = BINOP(BINOP.PLUS,exp1.unEx(),CONST(1));    // for wordsize 1

which stores the returned address to t1, _halloc would likely be a C-library function.

 

WHILE LOOPS

MiniJava IR while has been examined. The more general case that allows break statement would require a label at the end of the while (e.g. done) to branch to unconditionally on execution of the break.

test:

    if (!condition) goto done

        body

        goto test

done:

FOR LOOPS

Can be implemented as a while but fails on boundary conditions that exceed the control variable's representation. We have seen the translation of AST to IR while's. The straightforward translation of for to while is:

for( i=lo; i<=hi; i++)
     body
i=lo
while( i <= hi)
      body
      i++;

 

FUNCTION CALL

Function calls are:

CALL( NAME( new Label( "f"), new ExpList( exp1, exp2, null) )

Method calls require this as an argument and the class and method name be combined to uniquely define the method name:

CALL( NAME( new Label( "class_method"), new ExpList( this, exp1, exp2, null) )

The class_method label would need to correspond to that of a method declaration which defined the IR to execute.

 

STATIC LINKS - Skip

   

7.3 DECLARATIONS

VARIABLE DEFINITION

FUNCTION DEFINITION

prologue public void bm1(int a, int b, int c) { .method public bm1(III)V
.limit stack 1
.limit locals 4
body    b = 12;   MOVE(MEM(BINOP(PLUS, fp, CONST(2))), CONST(12)) 
epilogue }   return
.end method
 

prologue - in target assembly or machine language.

  1. pseudo-instructions needed to announce the beginning of a method
  2. method name label
  3. allocate new frame
  4. instructions to save 'escaping' arguments into frame and move 'non-escaping' arguments to registers
  5. save any registers, including return address

Example - JVM

.method public bm1(III)V
.limit stack 1
.limit locals 4

body

  1. in IR

Example - IR

MOVE(MEM(BINOP(PLUS, fp, CONST(2))), CONST(12)) 

epilogue

  1. return value
  2. restore registers
  3. deallocate frame
  4. return
  5. pseudo-instructions for method end

Example - JVM

  return
.end method


FRAGMENTS

Methods consist of:

Method body fragments and frame pairs are placed in a list.

Assembly code translation will walk the list generating method assembly code.

 

Code to add <frame, body> pair to fragment list.

  • procEntryExit called with the IR code body fragment to pair with the corresponding frame.
private LinkedList frags = new LinkedList();

public void procEntryExit(Exp body) {
  Frame.Frame myframe = level.frame;
  Exp bodyExp = body.unEx();
  Stm bodyStm;

  if (bodyExp != null)
      bodyStm = MOVE(TEMP(myframe.RV()), bodyExp);
  else
      bodyStm = body.unNx();

  ProcFrag frag = new ProcFrag(bodyStm, myframe);
  frags.add(frag);
}

public Iterator  getResults() {
   return frags.iterator();
}

 

CLASSES AND OBJECTS

class Vehicle {
   int position;
   int gas;
   int move( int x ) {
      this.position = x;
   }
}
 100
position
    0
gas
  0
 Vehicle a = new Vehicle()
a
100
 200
position
    0
gas
  0
 Vehicle b = new Vehicle()
b
200

new Vehicle() allocates and initializes a new Vehicle object. 

  • allocate heap space for instance variables (position and gas), parameters and locals are in frame or registers.
  • initialize by iterating through allocated memory

Similar to record and array creation.

Methods and the this pointer

Vehicle a = new Vehicle();
a
.move( 5 );

   int move( int x ) {
      this.position = x;
   }

Exercise - What is the value of this in above example?

 

JVM provides a new method to allocate and initialize objects.

this is the first method parameter at offset 0 of frame.

.method public static main([Ljava/lang/String;)V
.limit stack 2
.limit locals 1
  new             Vehicle                     // object address 
  bipush          5                            // x parameter
  invokevirtual Vehicle/move(I)I
  return
.end method
.method public move(I)V
.limit stack 3
.limit locals 2                                       // putfield Frame
  aload      0                                        // this @ 0
  iload       1                                        // x @ 1
  putfield       Vehicle/position I
  return
.end method
class Vehicle {
   int position;
   int gas;
   int move( int x ) {
      this.position = x;
   }
}
 100
position
    0
gas
  0
 Vehicle a = new Vehicle()
a
100
 200
position
    0
gas
  0
 Vehicle b = new Vehicle()
b
200

Accessing variables

frame - offsets from start of frame

  • this is at offset 0
  • arguments and local variables are in the frame
  • arguments first followed by locals

fields - class and field name from symbol table

  • access via method call to getfield or putfield
  • this first argument