Skip to main content
2020 Swift

SML-to-Racket

Source-to-source compiler with lambda calculus IR and Church-encoded algebraic datatypes

SML-to-Racket screenshot

Overview

A source-to-source compiler written in Swift that translates Standard ML programs into executable Racket code, with an optional lambda calculus intermediate representation featuring Church-encoded booleans and tuples. The compiler implements a full front-end pipeline (lexical analysis, recursive-descent parsing, name resolution, type inference with polymorphic type variables) and two independent back-ends: direct Racket code generation and a lambda calculus IR that Church-encodes all data structures as higher-order functions. The codebase spans ~8,500 lines of Swift across 84 source files, based on the official SML '97 Definition.

Architecture

SML-to-Racket architecture

The pipeline flows from lexical analysis through recursive-descent parsing with operator precedence climbing (6 levels), name resolution via a scoped symbol table, and bidirectional type inference. Two independent code generation paths branch after semantic analysis: a direct Racket code generator compiling datatypes to structs with pattern matching, and a lambda calculus generator that Church-encodes all data structures as higher-order functions.

Key Concepts

Church-Encoded Algebraic Datatypes

Church-Encoded Algebraic Datatypes

Each constructor of an n-variant datatype becomes an n-ary lambda that selects the corresponding handler. Constructors with associated values get an extra outer lambda to capture the payload. Case expressions become nested function application -- the encoded value applied to handler functions for each branch.

Code Highlights

Church-Encoded Algebraic Datatypes
func handleCase(_ c: AstCase, index: Int) -> (v: String, expression: Tree) {
  let n = node.cases.count
  let inner = c.associatedType.map { _ in
    Tree.application(fn: .variable(name: variables[index]),
                     value: .variable(name: "a"))
  } ?? Tree.application(fn: .variable(name: variables[index]),
                        value: .constant(value: 0))

  let abstraction = (1...n).reduce(inner) {
    Tree.abstraction(variable: variables[n - $1], expression: $0)
  }

  let datatypeConstructor = c.associatedType.map { _ in
    Tree.abstraction(variable: "a", expression: abstraction)
  } ?? abstraction

  return (v: c.name.name, expression: datatypeConstructor)
}

Highlights

  • Dual-target compiler with direct Racket emission and a lambda calculus IR featuring Church-encoded booleans, tuples, and algebraic datatypes
  • Complete SML front-end with bidirectional type inference supporting polymorphic type variables, algebraic datatypes, and multi-clause pattern matching
  • Pattern matching compilation: nested tuples/lists compiled to Racket cond with auto-generated car/cdr accessor chains
  • Church encoding of algebraic datatypes: each n-variant SML datatype compiles to n lambda terms selecting the i-th handler function