Compilers 101: You Don't Have To Go Home But You Can't Stay Here

Am I supposed to capitalize all those words? I don't know.

So we're at the end of our compiler. What has happened?

Abstract Syntax Tree

Our compiler has changed a fair amount. Our parser now generates code.

We moved from SLR to LALR because LALR was slightly more robust in allowing us to express one section for a grammar. Ian then modified the provided tool named ILALR to write some lisp code. The tool reads the grammar which has a function attached to each rule. It then processes the grammar into lisp code that create an array where each index of the array corresponds to a rule and the value of the array corresponds to the associated function for that rule.

During the parsing phase, every time we perform a shift on that rule, we execute the function. Each function then performs an actions that would create an abstract syntax tree.

This is a huge step up from dynamically reading in the grammar during run-time and this has greatly increased performance.

Semantic Analysis

This is fun. Since our previous assignment was based on a parse tree and not an abstract syntax tree, our non-existent scoping and typing were crushed. Now we traverse a tree in lisp (which was odd in its own right). Classes in lisp are really classes. They're more sexed up macros. And classes don't have functions. Rather, you create a generic function FOR a base class and overload the method on that class.

btw, there is no way to output the elements of a class without explicitly performing it.

Ian has been on typing for, I think, the best part of 6 days now. Unification has been more challenging that it seems.

Intermediate Representation

There are two purposes for intermediate representation that I can think of. The first is to create a generic structure that abstracts the back-end so that you can create different interpreters for the intermediate representation tree to assembly. The second is for optimization purposes. How? I haven't figured that out yet.

I think we made a reasonbly good decision in going for intermediate representation. If only, to find out that it wasn't a wise choice. I believe this allowed us to modularize the tasks and not have two people stranded in one area.

I currently have the job converting the abstract syntax tree to an intermediate representation tree. (I know Ian will hate that term =P)

Assembly

Dimitri's job is to take care of converting our intermediate representation to assembly. Our target architecture is SPARC V8.

Opinion

So far, I've been a grunt coder. I haven't made that many important design decisions. The abstract syntax tree was architected (wrd?) by Ian. Dimitri had come up with the intermediate representation. My tasks were initially supposed to be switched between Ian and I. However, while Ian was setting up scoping, he ran the nightmare of typing.

The situation may have turned out more ideal than I first thought because since I worked on the abstract syntax tree, I could transition to Dimitri's side. So far, there was a lot of coding with no results on my end. So I was blue-balling coding for a long while.

Anyways, in case you were wondering, Dimitri and I haven't put THAT much consideration in designing our intermediate representation. We realized with a targeted architecture, we didn't need something as abstract as what were specified in our textbooks. We chose instructions that could get us a small set of results that we planned: expressions, ifs, conditionals, procedures, functions, outputting, ...

By the by, we use gcc as our linker and assembler to finish off our compiler. We don't do back-end in this course. Anyways, if you're interested, here is an example of what we produce.

test-basic.adb

-- CS444_04: dgnidash, iwjhalli, rftluong
-- basic test program

package body Main is
b,a : integer;
begin
a := 2;
b := 3;
if (a < b) then
output("%d",b+a);
end if;
--output("%d \n", b);
--some comments
end Main;

Intermediate Representation

0: S1: seq
1: storemem
2: tmpname : (a)
2: storemem
3: tmpname : (__uniq8)
3: intcnst : val (2)
0: S2: seq
1: S1: seq
2: storemem
3: tmpname : (b)
3: storemem
4: tmpname : (__uniq7)
4: intcnst : val (3)
1: S2: seq
2: S1: seq
3: S1: seq
4: storemem
5: tmpname : (__uniq1)
5: condstmt
6: tmpname : (a)
6: LT
6: tmpname : (b)
3: S2: seq
4: ifstmt
5: boolcond : __uniq1
5: if-true :
5: S1: seq
6: storemem
7: tmpname : (__uniq2)
7: S1: seq
8: storemem
9: tmpname : (__uniq3)
9: storemem
10: tmpname : (__uniq4)
10: strcnst : val (%d)
7: S2: seq
8: S1: seq
9: storemem
10: tmpname : (__uniq5)
10: storemem
11: tmpname : (__uniq6)
11: addop : L(b), R(a)
8: S2: seq
9: func : printf ((__uniq3 __uniq5))
5: S2: seq
6: nop
5: if-false :
5: nop
2: S2: seq
3: nop

test-basic.s

.global .umul
.align 4
.global main
main:
save %sp, -232, %sp
mov 2, %l3
st %l3, [%fp-20]
st %l3, [%fp-24]
mov 3, %l3
st %l3, [%fp-28]
st %l3, [%fp-32]
ld [%fp-24], %l1
ld [%fp-32], %l2
cmp %l1, %l2
bge CMP_F_0
set 1, %l3
b CMP_AFT_0
nop
CMP_F_0:
set 0, %l3
CMP_AFT_0:
st %l3, [%fp-36]
ld [%fp-36], %l1
set 1, %l2
cmp %l1, %l2
bne IF_FALSE_1
sethi %hi(string_0), %l3
or %l3, %lo(string_0), %l3
st %l3, [%fp-40]
st %l3, [%fp-44]
ld [%fp-32], %l1
ld [%fp-24], %l2
add %l2, %l1, %l3
st %l3, [%fp-48]
st %l3, [%fp-52]
ld [%fp-44], %o0
ld [%fp-52], %o1
call printf, 0
nop
mov %o0, %l3
st %l3, [%fp-56]
b IF_END_1
nop
IF_FALSE_1:
IF_END_1:

nop
mov 0, %g1
mov %g1, %i0
ret
restore
.align 8
string_0:
.asciz "%d"

Output

> 5

So far we have the following things working:

assignments, expressions, conditionals, and output.

Compilers 101

Sunday, April 01, 2007

You Don't Have To Go Home But You Can't Stay Here

No comments:

Blog Archive

Contributors

Authors' Blogs

Todo List (Robert)