KeiLab — deep-tech research lab

What MAX is

MAX is a programming language whose syntax is mathematics — not as a library but as the language itself. Definitions are written with ≡, returns with ↩, ranges with ⟳ i ∈ 0..N, conditions with ⊃ / ⊅, matrix multiply with @, Hadamard product with ⊙, Frobenius norm with ‖·‖_F.

It is designed for the workloads the lab actually runs — neural-network training and inference, ML inference kernels, matrix-state dynamics, agent runtimes, OS-level substrates. The compiler emits portable C and direct GPU code (CUDA, HIP), with cross paths for Apple Metal and ARM bare-metal.

Why a custom language

Mainstream stacks bury the math inside library calls. The math the lab works with — matrix-state evolution, Genesis dynamics, neural ODE / closed-form continuous-time networks, oscillator coupling — is far cleaner when the operators are part of the language. MAX programs read like the paper they came from.

Owning the compiler also lets the lab control reproducibility end-to-end. Every research artifact published from the lab can be rebuilt byte-for-byte from source — there is no hidden randomness in the toolchain.

Triple-bootstrap, fixed-point

MAX compiles itself in three stages: seed → stage 1 → stage 2 → stage 3. The compiled outputs of stage 2 and stage 3 are byte-identical (sha-256 equality is checked in CI). That fixed-point invariant means the compiler describes its own behaviour completely — any code-generation change that breaks the invariant is rejected.

Standard library

~200 stdlib modules cover the math the lab leans on — DCT, RSVD, dense linear algebra, statistics and probability, BPE / SentencePiece tokenization, autograd primitives, optimizer steps (Adam / SGD), Kuramoto oscillator dynamics, closed-form continuous-time (CfC) networks, safetensors I/O, GGUF loader, paged KV-cache, FlashAttention-2, RoPE, RMSNorm, SwiGLU, quantization (Q3/Q4/Q5/Q6 K-block, W4A16). Every module header cites the source paper or internal theorem it realizes.

Concrete artifacts shipped on MAX

Qwen3-32B from-scratch GPU forward passexercised in-house on the lab's mixed Blackwell + Hopper + ARM-DGX fleet, with a full GPU backward pass and a parity-validated CPU reference (rel-L2 < 1e-5).
Layer-streaming forward — runs a 64-layer 32B-parameter model at ≈4.4 GB peak RSS by holding one layer at a time. Lossless, byte-identical to bulk forward.
Native HTTP server for chat APIs (OpenAI-compatible), no Python in the serving path.
Falcon-H1-7B chat runtime with on-the-fly Wikipedia ingestion when confidence is low — built on Sadalsuud-style routing primitives in stdlib.
OS substrate work — KeiSei OS boots in QEMU with userspace, ext2-rw, and an EL0 loader, with page-replace policy driven by Genesis math.

What is not public

The compiler engineering — syntax, bootstrap, backends, stdlib API — is openly documented. The math behind specific theorems wired into the language (Genesis dynamics, Sadalsuud routing, substrate decompositions) is patent-pending and stays inside the lab. Engineering partners receive the engineering. Co-development partners receive more under NDA.

cross-references

→ KeiSei OS (MAX-derived OS)→ GPU LLM stack → KeiSeiKit agent runtime ≡ Research portfolio