clox
Single-pass bytecode compiler and stack-based VM in pure C with Pratt parsing
Overview
clox is a bytecode compiler and stack-based virtual machine for the Lox programming language, implemented in pure C. It compiles source text in a single pass directly to a compact bytecode representation, using a Pratt parser to resolve operator precedence, then executes that bytecode on a stack machine with a tagged-union value representation. The implementation includes trie-based keyword scanning, run-length-encoded line information, and a constant pool whose operands are promoted automatically from 8-bit to 24-bit indices, all in roughly 900 lines across 10 modules.
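The operand promotion mentioned above can be sketched as follows. This is a minimal illustration, not the project's actual code: the opcode names `OP_CONSTANT` and `OP_CONSTANT_LONG`, the fixed-size `CodeBuf`, the helper names, and the little-endian byte order of the 24-bit operand are all assumptions made for the example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical opcode names; the project's real names may differ. */
enum { OP_CONSTANT = 0, OP_CONSTANT_LONG = 1 };

/* Fixed-size buffer keeps the sketch short; a real chunk would grow. */
typedef struct {
    uint8_t code[64];
    size_t count;
} CodeBuf;

static void emitByte(CodeBuf *buf, uint8_t byte) {
    assert(buf->count < sizeof buf->code);
    buf->code[buf->count++] = byte;
}

/* Emit a constant load, choosing the operand width from the pool index. */
static void emitConstant(CodeBuf *buf, uint32_t index) {
    assert(index <= 0xFFFFFF); /* must fit in 24 bits */
    if (index <= UINT8_MAX) {
        emitByte(buf, OP_CONSTANT);
        emitByte(buf, (uint8_t)index);
    } else {
        emitByte(buf, OP_CONSTANT_LONG);
        /* Low byte first (byte order is an assumption of this sketch). */
        emitByte(buf, (uint8_t)(index & 0xFF));
        emitByte(buf, (uint8_t)((index >> 8) & 0xFF));
        emitByte(buf, (uint8_t)((index >> 16) & 0xFF));
    }
}

/* Decode the pool index of the constant instruction at `offset`. */
static uint32_t readConstantIndex(const CodeBuf *buf, size_t offset) {
    if (buf->code[offset] == OP_CONSTANT) {
        return buf->code[offset + 1];
    }
    return (uint32_t)buf->code[offset + 1]
         | ((uint32_t)buf->code[offset + 2] << 8)
         | ((uint32_t)buf->code[offset + 3] << 16);
}
```

With this layout, the first 256 constants cost two bytes per load, and only larger pools pay for the four-byte form.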
Architecture
The compiler emits bytecode directly during parsing, with no intermediate AST. This is possible because the Pratt parser resolves operator precedence through a table lookup rather than encoding it in the grammar. A static VM instance avoids pointer indirection on every stack operation. A single Chunk struct owns the instruction stream, constant pool, and source position data, keeping related data co-located.
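To make the table-lookup idea concrete, here is a reduced, self-contained sketch of a rule table driving a single-pass compiler. It is an illustration under stated assumptions, not the project's code: tokens are single characters, operands are single digits, the "bytecode" is just postfix text, and the names `compileToPostfix`, `typeOf`, `out`, and the four-level `Precedence` enum are invented for the example.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>

typedef enum { TOKEN_NUMBER, TOKEN_PLUS, TOKEN_MINUS,
               TOKEN_STAR, TOKEN_SLASH, TOKEN_EOF } TokenType;
/* Lowest to highest binding power (reduced, hypothetical set). */
typedef enum { PREC_NONE, PREC_TERM, PREC_FACTOR, PREC_PRIMARY } Precedence;

typedef void (*ParseFn)(void);
typedef struct { ParseFn prefix; ParseFn infix; Precedence precedence; } ParseRule;

static const char *src;  /* next unread character */
static char previous;    /* most recently consumed token */
static char out[64];     /* emitted output, here postfix text */
static size_t outLen;
static int hadError;

static void parsePrecedence(Precedence precedence);
static const ParseRule *getRule(TokenType type);

static TokenType typeOf(char c) {
    if (isdigit((unsigned char)c)) return TOKEN_NUMBER;
    switch (c) {
        case '+': return TOKEN_PLUS;
        case '-': return TOKEN_MINUS;
        case '*': return TOKEN_STAR;
        case '/': return TOKEN_SLASH;
        default:  return TOKEN_EOF;
    }
}

static void emit(char c) {
    assert(outLen + 1 < sizeof out);
    out[outLen++] = c;
    out[outLen] = '\0';
}

static void advance(void) { previous = *src; if (*src != '\0') src++; }

static void number(void) { emit(previous); }

static void binary(void) {
    char op = previous;
    /* Parse the right operand one level tighter: left-associative. */
    parsePrecedence((Precedence)(getRule(typeOf(op))->precedence + 1));
    emit(op); /* operator is emitted after both operands */
}

/* The whole grammar of this sketch lives in this data table. */
static const ParseRule rules[] = {
    [TOKEN_NUMBER] = { number, NULL,   PREC_NONE   },
    [TOKEN_PLUS]   = { NULL,   binary, PREC_TERM   },
    [TOKEN_MINUS]  = { NULL,   binary, PREC_TERM   },
    [TOKEN_STAR]   = { NULL,   binary, PREC_FACTOR },
    [TOKEN_SLASH]  = { NULL,   binary, PREC_FACTOR },
    [TOKEN_EOF]    = { NULL,   NULL,   PREC_NONE   },
};

static const ParseRule *getRule(TokenType type) { return &rules[type]; }

static void parsePrecedence(Precedence precedence) {
    advance();
    ParseFn prefixRule = getRule(typeOf(previous))->prefix;
    if (prefixRule == NULL) { hadError = 1; return; }
    prefixRule();
    while (precedence <= getRule(typeOf(*src))->precedence) {
        advance();
        getRule(typeOf(previous))->infix();
    }
}

/* Compile a single-digit arithmetic expression; returns NULL on error. */
static const char *compileToPostfix(const char *source) {
    src = source; outLen = 0; out[0] = '\0'; hadError = 0;
    parsePrecedence(PREC_TERM);
    return hadError ? NULL : out;
}
```

In this sketch, adding an operator means adding one row to `rules`; `parsePrecedence` itself does not change. Output is emitted as soon as each operand and operator is recognized, which is why no tree needs to be built.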
Code Highlights
static void parsePrecedence(Precedence precedence) {
    advance();
    /* Every expression starts with a token that has a prefix rule. */
    ParseFn prefixRule = getRule(parser.previous.type)->prefix;
    if (prefixRule == NULL) {
        error("Expected expression.");
        return;
    }
    prefixRule();
    /* Consume infix operators that bind at least as tightly as requested. */
    while (precedence <= getRule(parser.current.type)->precedence) {
        advance();
        ParseFn infixRule = getRule(parser.previous.type)->infix;
        infixRule();
    }
}
Highlights
- Complete single-pass bytecode compiler in C -- from source text to executing VM -- with no AST intermediate representation
- Pratt parser: the entire expression language parsed by a 12-line function plus a data table
- In-place stack mutation, RLE-compressed debug info, automatic constant-operand size promotion, and a unified allocation choke point for a future GC
- Low-level C craftsmanship: manual tagged unions, macro-based generic containers, geometric growth allocators
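The RLE-compressed debug info listed above can be sketched as a table of (line, run length) pairs. This is a minimal illustration under assumptions: the names `LineRun`, `LineTable`, `addLine`, and `getLine`, and the fixed capacity of 32 runs, are invented for the example and are not taken from the project.

```c
#include <assert.h>
#include <stddef.h>

/* One run: `count` consecutive bytecode bytes from source line `line`. */
typedef struct {
    int line;
    int count;
} LineRun;

/* Fixed capacity keeps the sketch short; a real chunk would grow this. */
typedef struct {
    LineRun runs[32];
    size_t runCount;
} LineTable;

/* Record the source line of the next emitted bytecode byte. */
static void addLine(LineTable *t, int line) {
    if (t->runCount > 0 && t->runs[t->runCount - 1].line == line) {
        t->runs[t->runCount - 1].count++; /* extend the current run */
        return;
    }
    assert(t->runCount < sizeof t->runs / sizeof t->runs[0]);
    t->runs[t->runCount].line = line;
    t->runs[t->runCount].count = 1;
    t->runCount++;
}

/* Map a bytecode offset back to its source line; -1 if out of range. */
static int getLine(const LineTable *t, size_t offset) {
    size_t covered = 0;
    for (size_t i = 0; i < t->runCount; i++) {
        covered += (size_t)t->runs[i].count;
        if (offset < covered) {
            return t->runs[i].line;
        }
    }
    return -1;
}
```

Lookup is linear in the number of runs, which is a reasonable trade in this sketch because line information is only consulted when reporting a runtime error, while the table stays far smaller than one entry per bytecode byte.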
Related Projects
Blang (Bitis)
Lazy functional language with compile-time ownership and borrowing instead of garbage collection
Novel PL contribution: first lazy functional language using ownership/borrowing instead of GC, introducing self-borrows for cyclic data structures (graphs, infinite streams)
Atheris
Complete compiler for a Swift-like OOP language with vtables and polymorphic dispatch
Designed and implemented a complete compiler from scratch for a Swift-like OOP language (~120 Java source files, 8,000+ lines)
SML-to-Racket
Source-to-source compiler with lambda calculus IR and Church-encoded algebraic datatypes
Dual-target compiler with direct Racket emission and a lambda calculus IR featuring Church-encoded booleans, tuples, and algebraic datatypes