Atheris
Complete compiler for a Swift-like OOP language with vtables and polymorphic dispatch
Overview
Atheris is a complete compiler and interpreter for a Swift-like object-oriented programming language, built as a BSc thesis at the University of Ljubljana. The language supports classes with single inheritance, interfaces, abstract classes, structures, extensions, enumerations, tuples, arrays, optional types, and Swift-style named parameters with label-based overloading. The compiler implements the full pipeline from source text to execution: lexical analysis, recursive-descent parsing, name resolution, type checking, frame layout computation, IR generation, linearization, and interpretation on a custom stack-based virtual machine.
Architecture
The pipeline is orchestrated as a linear sequence of 7 phases, each consuming the output of the previous. All analysis phases implement a single ASTVisitor interface with 35 visit() overloads. Semantic information is attached to AST nodes via external maps (SymbolDescriptionMap, FrameDescriptionMap, ImcDescriptionMap) rather than fields on the nodes -- keeping the AST clean and enabling multiple independent passes to annotate the same tree.
Key Concepts
Virtual Table Construction & Dynamic Dispatch
Vtables are materialized in memory as [descriptor, base_vtable_pointer, method_labels...]. Override resolution walks the base class vtable; for each inherited method it checks if the current class overrides it. Dynamic dispatch loads the vtable pointer from the object, adds the method's vtable offset, and emits an indirect call through a temp register.
Stack Frame Layout with Static Links
The compiler supports nested functions with lexical scoping via static links. Each frame tracks its static level. When a nested function accesses a variable from an enclosing scope, the IR code traverses the static link chain by dereferencing the frame pointer diff times to reach the enclosing frame.
Code Highlights
private void storeVirtualTable(ImcVirtualTableDataChunk vtableChunk) {
ClassType type = vtableChunk.classType;
CanType baseClass = type.baseClass;
int baseClassVirtualTablePointer = 0;
if (baseClass != null) {
FrmVirtualTableAccess baseVtable = frameDescription
.getVirtualTable((ClassType) baseClass.childType);
baseClassVirtualTablePointer = baseVtable.location;
}
Interpreter.stM(Interpreter.heapPointer, type.descriptor);
Interpreter.heapPointer += Constants.Byte;
Interpreter.stM(Interpreter.heapPointer, baseClassVirtualTablePointer);
Interpreter.heapPointer += Constants.Byte;
}ClassType classType = (ClassType) t;
int indexForMember = classType.indexForMember(
((AstFunctionCallExpression) memberExpression).getStringRepresentation());
int offset = (indexForMember + 2) * Constants.Byte;
FrmTemp frmTemp = new FrmTemp();
ImcTEMP temp = new ImcTEMP(frmTemp);
ImcSEQ methodCallCode = new ImcSEQ();
methodCallCode.stmts.add(
new ImcMOVE(temp,
new ImcBINOP(ImcBINOP.ADD, new ImcMEM(e1), new ImcCONST(offset))));
ImcMethodCALL methodCall = new ImcMethodCALL(frmTemp);
methodCall.args = ((ImcCALL) c2).args;FrmLocAccess loc = (FrmLocAccess) access;
int diff = returnExprFrameStack.peek().staticLevel - loc.frame.staticLevel;
ImcExpr fp = new ImcTEMP(returnExprFrameStack.peek().FP);
for (int i = 0; i < diff; i++) {
fp = new ImcMEM(fp);
}
expr = new ImcMEM(new ImcBINOP(ImcBINOP.ADD, fp,
new ImcCONST(loc.framePointerOffset)));Highlights
- Designed and implemented a complete compiler from scratch for a Swift-like OOP language (~120 Java source files, ~8,000+ lines)
- Built a polymorphic dispatch system with vtable construction, override resolution, and runtime type checking via vtable chain traversal
- Implemented Swift-style named parameter overloading with label-based function disambiguation
- Benchmarked 230% faster than the reference PINS compiler on quicksort
- Directly led to a teaching assistant position designing the compilers course at the university
Related Projects
Blang (Bitis)
Lazy functional language with compile-time ownership and borrowing instead of garbage collection
Novel PL contribution: first lazy functional language using ownership/borrowing instead of GC, introducing self-borrows for cyclic data structures (graphs, infinite streams)
SML-to-Racket
Source-to-source compiler with lambda calculus IR and Church-encoded algebraic datatypes
Dual-target compiler with direct Racket emission and a lambda calculus IR featuring Church-encoded booleans, tuples, and algebraic datatypes
PINS Compiler Course
8 progressive compiler assignments from lexer to interpreter for 100+ university students
Designed a complete 8-assignment compiler course from lexical analysis to a working interpreter (12 to 73 source files incrementally)