A programming language designed exclusively for use by language models. Not meant to be human-readable — maximum information density, deterministic compilation, LLVM-based codegen.
Existing programming languages were designed for humans:
- Keywords (function, return, class) consume tokens without adding information
- Syntactic noise (parentheses, semicolons, indentation) exists for human benefit
- Grammar exceptions require memorization, not logic
LLMs generate and consume code token by token. NURL optimizes this process:
| Metric | Python | C | NURL |
|---|---|---|---|
| Tokens for "add two ints" | ~15 | ~12 | ~4 |
| Grammar productions | ~100 | ~200 | ~50 |
| Runtime performance | slow | fast | fast (LLVM) |
| Target platforms | one | many | any LLVM target |
Every syntactic construct is designed to minimize token count without information loss. A single character can carry full semantic meaning.
LLMs predict the next token from context. NURL's grammar has no exceptions — the same construct always works the same way. The grammar fits on a single page.
A token's meaning is derivable from at most 8 tokens of context. No long-range dependencies that could break during generation.
The same source always produces identical output. No UB, no platform differences, no behavioral variation. LLMs can trust code behaves as written.
One compilation pipeline → all target platforms without porting.
NURL source (.nu)
│
▼
Tokenizer
(deterministic, context-free)
│
▼
Parser
(LL(1), ≤3-token lookahead)
│
▼
LLVM IR (.ll)
│
▼
clang
│
┌────┴────────────┐
▼ ▼
native wasm32-wasi
(Linux/Win/macOS) (via WASI SDK)
The compiler (nurlc.nu) is written in NURL itself. The bootstrap runs it
twice over its own source and requires byte-identical LLVM IR on both rounds
before the build is accepted.
Syntax highlighting for VS Code / Windsurf is available in tooling/vscode-nurl/.
Install from VSIX:
1. Ctrl+Shift+P → "Extensions: Install from VSIX..."
2. Select tooling/vscode-nurl/nurl-0.1.0.vsix
The browser-based playground (see below) ships a Monaco port of the same tokenizer — no install required.
A FastAPI container under api/ exposes the compiler over HTTP and hosts a
Monaco-based playground that builds and runs NURL programs as
WebAssembly (wasm32-wasi) directly in the browser via
@bjorn3/browser_wasi_shim.
GET / — playground UI (editor, examples dropdown, build/run/download).GET /health — liveness probe; reports whether nurlc is available.POST /build_wasm — compile NURL source to wasm32-wasi.
Body: {"source":"…","filename":"main.nu","return_format":"json"|"binary","emit_ll":false}.
JSON mode returns base64-encoded wasm + compile logs; binary mode returns
raw application/wasm bytes.GET /examples — list bundled examples (examples/*.nu).GET /examples/{name} — fetch a specific example's source.GET /grammar — current grammar rendered as HTML (from spec/grammar.ebnf).GET /readme — this README rendered as HTML.GET /docs, /redoc, /openapi.json — OpenAPI explorers.From the repository root (the build context must be the repo root so
the Dockerfile can access build.sh, compiler/, stdlib/, examples/,
spec/, README.md):
docker build -f api/Dockerfile -t nurl-api:dev .
docker run --rm -p 8000:8000 nurl-api:dev
# → http://localhost:8000/ (playground)
# → http://localhost:8000/docs (Swagger UI)
nurlc <file.nu> → LLVM IR on stdout.wasm32-wasi ABI
(renames @main → @__main_argc_argv, injects the target triple,
inserts i32/i64 shims for malloc/puts to match libc signatures).clang --target=wasm32-wasi -O2 <ir>.ll /opt/nurl/stdlib/runtime.wasm.o -o out.wasm
using the WASI SDK (24.0) bundled into the image.The wasm-compiled NURL runtime (stdlib/runtime.wasm.o) is baked into the
image at build time. See api/README.md for local-dev instructions without
Docker.
NURL uses prefix notation. The structure is always:
OP ARG1 ARG2 ... ARGN
i — integer (64-bit, signed)
u — integer (64-bit, unsigned)
f — float (64-bit)
b — boolean
s — string (UTF-8, immutable)
v — void
*T — pointer to T
: — binding (variable / struct / enum / const decl)
= — assignment
@ — function definition / aggregate constructor
→ — return type / arrow
. — member access / indexing
( ) — function call
? — ternary conditional / ?T — option type
?? — pattern match (exhaustive)
~ — loop / for-each / mutability prefix / bitwise complement
& — and (logical i1, bitwise i64) / FFI decl prefix
| — or (logical i1, bitwise i64) / enum-decl separator / slice-literal separator
! — logical NOT / Result type prefix (! T E)
\ — try-propagate / closure (lambda)
^ — explicit return
# — type cast
Z — sizeof
% — trait / impl decl
$ — import decl
` — string literal
@ add i a i b → i { ^ + a b }
? > x 0
`positive`
`non-positive`
: i n 0
~ < n 10 {
= n + n 1
}
: Point { i x i y }
@ dist Point p → f {
^ + * . p x . p x
* . p y . p y
}
( add 3 4 )
( dist myPoint )
: i x 10 // immutable — reassignment is a compile error
: ~ i counter 0 // mutable — ~ prefix
= counter + counter 1
: | Json { JNull JBool b JNum i JStr s }
@ describe Json v → s {
^ ?? v {
JNull → `null`
JBool x → ? x `true` `false`
JNum n → ( nurl_str_int n )
JStr s → s
}
}
: [i nums [ i | 1 2 3 4 5 ]
: i total 0
~ n nums { = total + total n }
: (@ i i) square \ i x → i { * x x }
( apply square 7 ) // 49
@ parse s src → ! i ParseErr { ... }
@ sum_two s a s b → ! i ParseErr {
: i x \ ( parse a ) // `\` unwraps Ok, propagates Err
: i y \ ( parse b )
^ @ ! i ParseErr { + x y }
}
% Shape [T] {
@ area T obj → i // required
@ describe T obj → i { // default body
( nurl_print ( nurl_str_int ( area obj ) ) )
^ 0
}
}
% Shape Rect { @ area Rect r → i { ^ * . r w . r h } }
Comparison: sum the numbers 1–100.
Python (~46 tokens):
def sum_to_hundred():
total = 0
for i in range(1, 101):
total += i
return total
NURL (~13 tokens):
@ sumto i n → i {
: i acc 0
: i k 1
~ <= k n { = acc + acc k = k + k 1 }
^ acc
}
The current compiler is deliberately minimal:
: i x 0 is immutable; opt in to mutation with : ~ i x 0. The compiler rejects assignment to immutable bindings at compile time.malloc / free via FFI). No GC pauses, no hidden boxing.[T compiles to { T*, i64 } (pointer + length); the string type s is currently a C-style i8* pointer, but user code can wrap it in a { s ptr, i len } struct for bounds-safe operations (see tests/test_11_fat_strings.nu).?T — compiles to { i1, T }, checked with ?? pattern matching.nurl_free at scope exit automatically. Closures still use RC for captured env.nurl_str_cat, _cat3/4, _int, _float, _slice, nurl_read_file). Default ON, including for the compiler itself: retaining C runtime helpers (nurl_lex_new, nurl_set_last_type, nurl_get_last_type, nurl_argv, nurl_sym_get, nurl_lex_filename, nurl_print_buf_stop) strdup their inputs/outputs so callers can auto-drop safely. ?, ~, and ?? arms scope their : bindings in a new symtab frame so owned-string entries don't leak into sibling branches. Reassigning an owned i8* to a fresh allocating call frees the previous value first; allocating-call results passed inline as call arguments are released right after the callee returns (callee-borrows convention — retaining helpers must strdup).@ T { ... } populates a field directly from a fresh owned allocation, the compiler records a per-field drop against the binding's alloca and emits a load + extractvalue + nurl_free at scope exit. Covers two kinds: (a) i8* fields populated from allocating string calls (nurl_str_cat, _cat3/4, _int, _float, _slice, nurl_read_file); (b) slice [T fields populated from a slice literal [ T | ... ] or a slice-returning call. Conservative by design — only fields populated from a fresh allocation on the spot get a drop, so copying an already-owned binding into a struct does not cause a double-free. Nested owned-struct fields and arm-local struct bindings that fall through (no ^) still leak, same as the existing arm-scoped string behaviour.|) and product types (structs)The compiler emits LLVM IR and delegates native codegen to clang, so any
target clang supports is reachable in principle. Only the first two are
exercised by the build scripts today.
| Platform | Backend | Status |
|---|---|---|
| Linux x86_64 | LLVM | primary dev target — build.sh + tests |
| Windows x86_64 | LLVM | fully supported — build.bat runs the same bootstrap + snapshot test suite as build.sh |
| macOS ARM64 | LLVM | should work via clang; untested |
| WebAssembly | wasm32-wasi | supported via the api/ container (WASI SDK 24.0); browser execution via browser_wasi_shim |
| Android / iOS | LLVM cross | planned |
| Embedded (no_std) | LLVM | planned |
| JVM | JVM bytecode | future |
| .NET CLR | CIL | future |
nurl/
├── spec/ — formal language specification
│ ├── grammar.ebnf ✓ current (v1.1)
│ ├── grammar_v0.1.ebnf … — historical snapshots (v0.1 → v1.0)
│ ├── types.md
│ ├── ir.md
│ └── bootstrapping.md
├── compiler/
│ ├── nurlc.nu ✓ self-hosting compiler, written in NURL
│ ├── nurlc.py — Python bootstrap compiler
│ ├── src/ — Python compiler internals
│ │ ├── lexer.py
│ │ ├── parser.py
│ │ ├── typechecker.py
│ │ ├── ir_gen.py
│ │ └── llvm_gen.py
│ └── tests/ — 80+ `.nu` test programs + snapshot runner
│ ├── run_tests.sh — Linux/macOS test runner
│ ├── run_tests.bat — Windows test runner
│ ├── correct.txt — golden baseline (status + output per test)
│ └── *.nu — positive and negative tests
├── stdlib/
│ ├── runtime.c ✓ C runtime (I/O, string helpers, FFI surface)
│ ├── runtime.o — native host build
│ └── runtime.wasm.o — wasm32-wasi build (produced inside the API image)
├── examples/ — curated `.nu` programs surfaced by the playground
│ ├── showcase.nu calculator.nu fizzbuzz.nu collatz.nu wordcount.nu
│ └── enigma.nu slice_test.nu test_05_closures_and_capture.nu …
├── api/ — FastAPI container (compiler-as-a-service + playground)
│ ├── Dockerfile — multi-stage build; installs WASI SDK; bootstraps nurlc
│ ├── app/main.py — endpoints, IR-rewrite shims, docs rendering
│ ├── static/index.html — Monaco-based playground, runs wasm in-browser
│ └── requirements.txt
├── tooling/
│ └── vscode-nurl/ — VS Code / Windsurf syntax-highlighting extension
├── build/ — all bootstrap artefacts land here
│ ├── nurlc_py(.ll) — stage 0: Python-compiled `nurlc.nu`
│ ├── nurlc_self(.ll) — stage 1: self-compiled
│ ├── nurlc_self2(.ll) — stage 2: fixed-point check
│ └── nurlc — final self-hosting binary
├── build.sh / build.bat — full bootstrap + test-suite driver
├── clean.sh / clean.bat — remove build artefacts
├── nurl.sh / nurl.bat — convenience wrapper to compile a `.nu` file
└── nurlc — symlink to build/nurlc (Linux/macOS)
Current language version: grammar v1.1 (spec/grammar.ebnf).
Historical grammar snapshots are kept under spec/grammar_v0.1.ebnf … spec/grammar_v1.0.ebnf.
spec/grammar.ebnf)spec/types.md)spec/ir.md)compiler/nurlc.py)nurlc --llvm)stdlib/runtime.c + built-in FFI (nurl_print, nurl_str_int, …)compiler/nurlc.nu written in NURL! T E with try-propagation \~ prefix), bitwise vs short-circuit & | dispatched by operand type, higher-order functionsClientErr 404), fat-pointer string idiom, 3-payload match bindings, BOOL patterns for Option-shaped tagswasm32-wasi compilation via WASI SDK 24.0, shipped as a FastAPI container (api/) with in-browser execution through browser_wasi_shim. IR-rewrite layer handles libc-ABI shims (malloc, puts, __main_argc_argv).api/static/index.html), live grammar/README rendering (/grammar, /readme), examples browser (/examples).$ `path` alias → alias__name), mod::symbol namespace syntax with lexer-level name-mangling (a::b → a__b). Remaining: visibility (pub / implicit-private), package manifest, remote dep resolution.[x] Exhaustive match checking: ?? expressions require all enum variants to be covered at compile time; a _ wildcard arm satisfies the requirement. Missing arms produce a compile error: non-exhaustive match on T, unhandled: Variant. Duplicate arms are also rejected: duplicate match arm for variant: V. See tests/t1_match_*.nu … t7_match_*.nu.
[x] Result type ! T E and try-propagation \: ! T E compiles to { i1, i64 } (flag + payload). @ ! T E { T val } and @ ! T E { F err } construct Ok/Err values. \ (try operator) unwraps Ok or propagates Err from the enclosing function, preserving the error payload. Compile-time checks: \ on a non-Result type produces error: try operator \ used on non-Result type: X; mismatched error types produce error: try propagation type mismatch — function returns ! T E1 but \ received ! T E2. See tests/t8_result_*.nu … t15_result_type_mismatch.nu.
\ param* → T { body }) with by-value capture of enclosing locals. Function values are compiled as a { fn_ptr, env_ptr } pair; the function pointer takes an implicit leading i8* env argument. See tests/test_05_closures_and_capture.nu, test_closure_basic.nu, test_07_closure_lifetimes.nu.: defaults to immutable; : ~ opts into mutation. Applies to both let_stmt and const_decl. Reassigning an immutable binding is a compile error. See tests/test_immutable_*.nu.& | dispatched by LLVM type — on i1 they are logical with short-circuit evaluation; on integers they are bitwise AND/OR. Any other operand type is a compile error.map/filter/fold-style functions that take (@ R P*) parameters and instantiate generic slice type parameters. See tests/test_08_higher_order.nu.tests/test_09_trait_defaults.nu.ClientErr 404 → …) that dispatches when the payload equals the literal; _ still covers the fallback. See tests/simple_json_match.nu, routertrap.nu.% Drop T { @ drop T self → v { … } } trait invoked at scope exit. See stdlib/STDLIB.md and compiler/tests/arm_local_drop.nu, struct_nested_field_drop.nu.mod::symbol lexer fusion — duplicate-include guard, $ \path` aliasmangles imported top-level@-functions toalias__name, and the lexer fusesmod::syminto a singlemod__symIDENT (matched in bothruntime.candcompiler/src/lexer.py). Seecompiler/tests/alias_import_mod.nu,alias_import_use.nu`.{ s ptr, i len }, construct from strlen, pass to write(2) with explicit length. See tests/test_11_fat_strings.nu.T / F as match-arm names match an i1 tag directly (?? some_opt { T v → v F → 0 }).| Tool | Purpose |
|---|---|
| Python 3.8+ | Python reference compiler (compiler/nurlc.py) |
| clang / LLVM 14+ | Compile LLVM IR (.ll) to native binary |
Install LLVM from llvm.org/releases (choose the Windows installer for the latest stable release).
The installer adds clang.exe and related tools to PATH.
You can use Command Prompt, PowerShell, or Git Bash for the commands below.
sudo apt install python3 clang
sudo dnf install python3 clang
brew install llvm
# Add LLVM to PATH for this shell (add to ~/.zshrc or ~/.bash_profile to persist):
export PATH="$(brew --prefix llvm)/bin:$PATH"
# Linux / macOS
clang -c stdlib/runtime.c -o stdlib/runtime.o
# Windows (CMD / PowerShell)
clang -c stdlib\runtime.c -o stdlib\runtime.o
stdlib/runtime.ois already checked in; rebuild it only if you modifyruntime.c.
Use the automated build scripts to bootstrap the compiler and verify stability:
# Linux / macOS
./build.sh
# Windows (CMD / PowerShell)
build.bat
The build script performs a complete bootstrap process:
1. Compiles nurlc.nu with the Python reference compiler → build/nurlc_py
2. Compiles nurlc.nu with the stage-0 binary → build/nurlc_self (stage 1)
3. Compiles nurlc.nu with stage 1 → build/nurlc_self2 (stage 2)
4. Verifies stages 1 and 2 produce byte-identical LLVM IR (bootstrap fixed point)
5. Copies stage 2 to build/nurlc and symlinks it at the repo root
6. Runs the snapshot test suite (compiler/tests/run_tests.sh on Linux/macOS, compiler/tests/run_tests.bat on Windows) and diffs against correct.txt
All build artefacts are stored under build/. The run prints
BUILD SUCCESS & TESTS PASSED on success, or the full log / diff on failure.
Clean build artifacts:
# Linux / macOS
./clean.sh
# Windows (CMD / PowerShell)
clean.bat
Manual build (if needed):
# Create build directory
mkdir -p build # Linux/macOS
mkdir build # Windows
# Generate LLVM IR using Python compiler
python compiler/nurlc.py --llvm compiler/nurlc.nu > build/nurlc.ll
# Link into native binary
clang build/nurlc.ll stdlib/runtime.o -o build/nurlc # Linux/macOS
clang build\nurlc.ll stdlib\runtime.o -o build\nurlc.exe # Windows
.nu fileRecommended (automated):
# Linux / macOS
./nurl.sh myprogram.nu # Creates myprogram binary
./nurl.sh myprogram.nu myoutput # Creates myoutput binary
# Windows
nurl.bat myprogram.nu # Creates myprogram.exe
nurl.bat myprogram.nu myoutput # Creates myoutput.exe
Manual (two-step):
# Linux / macOS
./nurlc myprogram.nu > myprogram.ll # or build/nurlc
clang myprogram.ll stdlib/runtime.o -o myprogram
./myprogram
# Windows
nurlc.exe myprogram.nu > myprogram.ll # or build\nurlc.exe
clang myprogram.ll stdlib\runtime.o -o myprogram.exe
myprogram.exe
The Python reference compiler (compiler/nurlc.py) exists solely to bootstrap the
self-hosting compiler. It implements the subset of grammar v1.1 that nurlc.nu itself
uses — structs, functions, the :/=/@/^/?/~/(/./# core, basic traits
and impls — and omits most of the features added in Groups D–F:
ffi_decl, enum_decl, defer_stmt, try_expr (\), sizeof_expr (Z), agg_expr, res_type (! T E), closures, slice literals, for-each, generic instantiation, integer-indexed member_expr.fn_type as @ R P* in type position, while the grammar spec and nurlc.nu both use (@ R P*).Anything beyond the bootstrap subset must be compiled with the self-hosted
build/nurlc binary. The Python compiler is not a user-facing tool.
The following are known limitations of the current compiler (nurlc.nu, grammar v1.1).
They reflect deliberate scope decisions rather than bugs, and are tracked for future work.
| Limitation | Workaround |
|---|---|
Single-letter type keywords (i u f b s v) cannot be used as variable names with type inference |
Use an explicit type annotation: : i n expr |
No sized types (i8, u32, f64 …) — lexer emits i + 8 as two tokens |
Use base types (i, f) and cast with # |
zext / trunc casts not implemented — i1 cannot be widened to i64 directly |
Use nurl_print_bool for boolean output; avoid mixing i1 and i64 |
! T E payload is stored as i64; complex T/E types (structs with payloads > 8 bytes) may not round-trip correctly through # cast |
Use base types and simple enums (tag-only) as T and E |
| Limitation | Workaround |
|---|---|
Variadic functions (e.g. printf) cannot be declared via ffi_decl — LLVM IR varargs syntax (...) is not generated |
Use nurl_print_* builtins; declare specific non-variadic wrappers in C |
| No tail-call optimisation — deep recursion may stack-overflow | Use explicit loops (~) |
| Closures capture by value (snapshot at construction); mutating an enclosing local after the closure is built does not affect the captured copy | Keep mutation explicit; pass the current value as a parameter or return the new state |
| Limitation | Workaround |
|---|---|
| Enum variants with a named-struct payload require the struct to be declared before the enum in the same file — forward references are not supported | Order declarations: structs first, enums after |
| Pattern matching binds at most 2 payload variables per arm — variants with 3+ payloads cannot fully destructure in a single arm | Access additional payload fields via separate . extraction after matching |
| Limitation | Workaround |
|---|---|
import_decl is a static inline-include (like #include) — the imported file is compiled into the same LLVM module |
Avoid importing files that define main; avoid circular imports |
Import alias ($ `path` alias) is parsed but ignored — all imported names land in the global namespace |
Prefix imported names manually (e.g. math_sin, math_cos) |
| No duplicate-include guard — importing the same file twice emits duplicate definitions | Import each file at most once |
| Limitation | Workaround |
|---|---|
Negative integer literals cannot be written directly — -1 tokenises as MINUS INT(1) |
Use ~ 0 (bitwise complement) for -1; compute negatives as - 0 n |
No automatic memory management — heap-allocated values (slice literals, strcat results, etc.) are not freed |
Call free via FFI when needed; keep values on the stack where possible |
| Import is inline-include only: no namespaces, no duplicate guard, alias parsed but ignored | Import each file at most once; prefix names manually |
NURL is designed so that:
NURL = Neural Unified Representation Language
Also: NURL = Non-hUman Readable Language
File extension: .nu