renderbox-engine
Composable type-safe TypeScript DSL that compiles to FFmpeg, ONNX Runtime, and OpenCV
Overview
renderbox-engine is a composable, type-safe DSL library for video, audio, and AI pipelines. Instead of hand-writing FFmpeg commands — where a simple slideshow with Ken Burns effects and crossfade transitions produces a 40-line command — you describe what you want in TypeScript and the compiler figures out how to execute it. The same composable syntax extends to multi-runtime pipelines: a face-redaction pipeline that decodes with FFmpeg, runs ONNX inference for detection, and applies OpenCV blur composes identically to a pure-FFmpeg pipeline. The compiler handles stream labeling, DAG deduplication, crossfade offset arithmetic, parameter escaping, filter graph wiring, runtime partitioning, and boundary marshalling automatically. Published on npm as renderbox-engine.
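To make "crossfade offset arithmetic" concrete, here is a minimal sketch of the calculation a compiler of this kind has to perform when chaining FFmpeg `xfade` filters. The helper names (`crossfadeOffsets`, `xfadeGraph`) and the `[xN]` label scheme are hypothetical and are not part of renderbox-engine's API. The only assumption about FFmpeg is that `xfade` takes `transition`, `duration`, and `offset` options, with `offset` measured from the start of the first input.

```typescript
// Hypothetical helpers for illustration; not renderbox-engine's actual API.
interface CrossfadePlan {
  offsets: number[];     // start time of each transition, in seconds
  totalDuration: number; // length of the joined output, in seconds
}

function crossfadeOffsets(durations: number[], transition: number): CrossfadePlan {
  const [first = 0, ...rest] = durations;
  const offsets: number[] = [];
  let accumulated = first; // length of everything joined so far
  for (const duration of rest) {
    // The fade begins `transition` seconds before the joined output ends.
    const offset = accumulated - transition;
    offsets.push(offset);
    // The next clip starts playing at `offset`, so the output now ends here.
    accumulated = offset + duration;
  }
  return { offsets, totalDuration: accumulated };
}

// Render the chain as an FFmpeg filtergraph string using the xfade filter.
function xfadeGraph(durations: number[], transition: number, kind = "fade"): string {
  const { offsets } = crossfadeOffsets(durations, transition);
  return offsets
    .map((offset, i) => {
      const left = i === 0 ? "[0:v]" : `[x${i}]`;
      return `${left}[${i + 1}:v]xfade=transition=${kind}:duration=${transition}:offset=${offset}[x${i + 1}]`;
    })
    .join(";");
}

// Three 4-second slides joined by 1-second transitions:
// the transitions start at 3 s and 6 s, and the result is 10 s long.
const threeSlides = crossfadeOffsets([4, 4, 4], 1);
const threeSlidesGraph = xfadeGraph([4, 4, 4], 1);
```

Each transition overlaps two clips, so the total duration is the sum of the clip lengths minus one transition length per join. Hand-written commands tend to go wrong here whenever a slide is added or a duration changes.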
Architecture
Every pipeline is represented as a typed abstract syntax tree — pure data, not imperative code. Pipelines are JSON-serializable, inspectable, and optimizable before any process is spawned. A shared set of interpreters each traverse the AST to produce different outputs: FFmpeg CLI arguments, optimization rewrites, validation errors, visual graphs, content hashes, or cost estimates. For multi-runtime pipelines, the compiler tags each node with its target runtime, partitions the graph at runtime boundaries, inserts data-format conversion nodes, and emits a serializable execution plan of ordered segments connected by pipe edges. The plan is then dispatched to per-runtime handlers — adding a new runtime means registering a handler, not modifying the compiler.
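The "one tree, many interpreters" idea can be sketched in a few lines. The node shapes and function names below are assumptions made for illustration, since the library's real AST is not shown here. The point is only that each interpreter is an independent function over the same data.

```typescript
// Hypothetical AST for illustration; renderbox-engine's real node types may differ.
type PipelineNode =
  | { kind: "input"; path: string }
  | { kind: "filter"; name: string; args: Record<string, string | number>; source: PipelineNode };

// Interpreter 1: compile the tree into an FFmpeg-style filter chain.
function toFilterChain(node: PipelineNode): string[] {
  if (node.kind === "input") return [];
  const args = Object.entries(node.args)
    .map(([key, value]) => `${key}=${value}`)
    .join(":");
  const rendered = args.length > 0 ? `${node.name}=${args}` : node.name;
  return [...toFilterChain(node.source), rendered];
}

// Interpreter 2: validate the same tree without executing anything.
function validate(node: PipelineNode): string[] {
  if (node.kind === "input") {
    return node.path === "" ? ["input path is empty"] : [];
  }
  const errors = validate(node.source);
  if (node.name === "scale" && !("w" in node.args)) {
    errors.push("scale requires a width (w)");
  }
  return errors;
}

// Interpreter 3: a crude cost estimate, counting filter nodes.
function filterCount(node: PipelineNode): number {
  return node.kind === "input" ? 0 : 1 + filterCount(node.source);
}

const tree: PipelineNode = {
  kind: "filter",
  name: "fps",
  args: { fps: 30 },
  source: {
    kind: "filter",
    name: "scale",
    args: { w: 1280, h: 720 },
    source: { kind: "input", path: "in.mp4" },
  },
};

const chain = toFilterChain(tree).join(","); // "scale=w=1280:h=720,fps=fps=30"
const problems = validate(tree);             // no errors for this tree
const cost = filterCount(tree);              // 2
```

Because the interpreters share nothing but the node type, each one can be tested on its own with plain object fixtures.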
Key Concepts
Pipelines as Data
Every pipeline is a typed syntax tree — not a sequence of side effects. You can build a sub-pipeline, pass it to a function, optimize it, serialize it, visualize it, or diff it against another pipeline. The tree is the single source of truth; execution is a final interpretation step that leaves the tree unchanged.
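As a small sketch of what "pipelines as data" allows: a sub-pipeline is an ordinary value that can be passed to a function and round-tripped through JSON. The `Pipeline` type and the curried `filter` helper are hypothetical stand-ins that mimic the `f(options)(source)` style used in the examples on this page. They are not the library's own definitions.

```typescript
// Hypothetical types and helpers for illustration only.
type Pipeline =
  | { kind: "input"; path: string }
  | { kind: "filter"; name: string; source: Pipeline };

// Curried builder: filter(name)(source) returns a new tree and never mutates `source`.
const filter = (name: string) => (source: Pipeline): Pipeline => ({
  kind: "filter",
  name,
  source,
});

// A sub-pipeline is just a value...
const mirrored: Pipeline = filter("hflip")({ kind: "input", path: "clip.mp4" });

// ...so it can be passed to a function that extends it.
const withGrain = (p: Pipeline): Pipeline => filter("noise")(p);
const finalTree = withGrain(mirrored);

// Serialize and restore; the restored tree is structurally identical.
const json = JSON.stringify(finalTree);
const restored = JSON.parse(json) as Pipeline;
const roundTripsCleanly = JSON.stringify(restored) === json;

// List the filters in application order, e.g. to diff two pipelines.
function filterNames(p: Pipeline): string[] {
  return p.kind === "input" ? [] : [...filterNames(p.source), p.name];
}

const before = filterNames(mirrored);  // ["hflip"]
const after = filterNames(finalTree);  // ["hflip", "noise"]
```

Building `finalTree` leaves `mirrored` untouched. This is the property that makes it safe to reuse one sub-pipeline in several outputs.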
Multi-Runtime Compilation
A single pipeline can span FFmpeg (media processing), ONNX Runtime (ML inference), and OpenCV (computer vision). The compiler automatically detects runtime boundaries, partitions the graph into segments, inserts marshalling for data format conversion between runtimes, and emits an execution plan that an executor can run as coordinated processes wired by pipes.
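The partitioning step can be sketched for the simplest case, a linear chain of operations. It assumes every step is already tagged with its runtime. All names here (`Step`, `Segment`, `Pipe`, `partition`) and the extra `scale` step are hypothetical. The real compiler works on a graph, not a list, and its plan format is not shown in this document.

```typescript
// Hypothetical plan shapes for illustration; not renderbox-engine's wire format.
type Runtime = "ffmpeg" | "onnx" | "opencv";

interface Step { op: string; runtime: Runtime }
interface Segment { runtime: Runtime; ops: string[] }
interface Pipe { from: number; to: number; convert: string }
interface ExecutionPlan { segments: Segment[]; pipes: Pipe[] }

function partition(steps: Step[]): ExecutionPlan {
  const segments: Segment[] = [];
  const pipes: Pipe[] = [];
  for (const step of steps) {
    const current = segments[segments.length - 1];
    if (current !== undefined && current.runtime === step.runtime) {
      // Same runtime as the previous step: extend the current segment.
      current.ops.push(step.op);
      continue;
    }
    // Runtime boundary: start a new segment...
    segments.push({ runtime: step.runtime, ops: [step.op] });
    if (current !== undefined) {
      // ...and record a pipe that marshals data between the two runtimes.
      pipes.push({
        from: segments.length - 2,
        to: segments.length - 1,
        convert: `${current.runtime}->${step.runtime}`,
      });
    }
  }
  return { segments, pipes };
}

// The face-redaction shape: FFmpeg decode, ONNX detect, OpenCV blur, FFmpeg encode.
const redaction: Step[] = [
  { op: "decode", runtime: "ffmpeg" },
  { op: "scale", runtime: "ffmpeg" }, // extra step added here to show grouping
  { op: "detect", runtime: "onnx" },
  { op: "blur", runtime: "opencv" },
  { op: "encode", runtime: "ffmpeg" },
];

const redactionPlan = partition(redaction);
// Four segments (ffmpeg, onnx, opencv, ffmpeg) joined by three pipes.
const serializedPlan = JSON.stringify(redactionPlan);
```

The plan is plain data, so it can be serialized and handed to a separate executor, which only has to start one process per segment and connect them in pipe order.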
Code Highlights
const slides = [
  kenBurns(gentleZoomIn, frames, w, h)(staticImage("exterior.jpg", 4)),
  kenBurns(zoomIn, frames, w, h)(staticImage("living-room.jpg", 4)),
  kenBurns(panRight, frames, w, h)(staticImage("garden.jpg", 4)),
];
const { video, totalDuration } = slideshow(slides, 1, {
  segmentDuration: 4,
  transitions: ["fade", "dissolve"],
});
const audio = musicBed({ totalDuration })(input("background-music.mp3"));
const graph = output(video, audio, "output.mp4", {
  codec: "libx264", crf: 20, pixelFormat: "yuv420p", faststart: true,
});
await run(optimize(graph));

const pipeline = detection.detect("yolov8n-face")(input("surveillance.mp4"));
const redacted = detection.redact("blur", { radius: 30 })(pipeline);
const graph = output(redacted, null, "redacted.mp4", { codec: "libx264" });
// Compiles to: FFmpeg decode → ONNX detect → OpenCV blur → FFmpeg encode
// Runtime boundaries, marshalling, and pipe wiring are automatic

import {
  inputVideo, faceRedact, backgroundBlur, TEMPLATES
} from "renderbox-engine";
const video = inputVideo("interview.mp4");
// Direct function call
const graph = faceRedact(video, { method: "blur", confidence: 0.8 });
// Name-based lookup for dynamic dispatch
const builder = TEMPLATES["background-blur"];
const graph2 = builder(video, { radius: 20 });

Highlights
- 21,500+ lines of TypeScript across 179 source files with 183 test files — composable DSL that compiles high-level pipeline descriptions into multi-runtime execution plans
- 88+ typed video filters, 50+ typed audio filters, and 9 AI domains (including detection, transcription, depth estimation, pose, OCR, scene classification, and segmentation) with compile-time stream type safety
- Multi-runtime compiler that automatically partitions a single pipeline graph across FFmpeg, ONNX Runtime, and OpenCV — inserting boundary marshalling and emitting a serializable execution plan
- 12+ pluggable interpreters over a shared typed AST: compile, optimize, validate, visualize (DOT/Mermaid), inspect, hash, cost-estimate, snapshot — each independently testable
- Binary wire format shared with a companion Rust executor for cross-language plan serialization, with golden fixture regression testing across both codebases
Related Projects
PromiseKit
Production Promise library built from algebraic first principles, shipped to ~50K DAU across multiple iOS apps
Designed and built a production Promise library from algebraic first principles -- map derives from flatMap, reduce from fold, operators match Haskell's typeclass precedence hierarchy
fluent-html
Zero-dependency, type-safe HTML builder for TypeScript with compile-time Tailwind CSS and HTMX safety
Designed a mixin architecture using TypeScript declaration merging + prototype assignment that splits a 1,100-line Tag class into focused modules with zero API changes
Redis Server (Zig)
From-scratch Redis-compatible server in Zig with RESP parsing, RDB persistence, and master-replica replication
Implemented a complete Redis-compatible server from scratch in 915 lines of zero-dependency Zig with RESP protocol parser, key-value store with TTL, and RDB binary format codec