Overview
RukaLang is an experimental language with mutable value semantics and staged metaprogramming inspired by MetaOCaml.
The fastest way to understand current direction is:
- README.md for project status and entry points.
- Language design reference for language/runtime semantics and current MVP direction.
- Metaprogramming reference for staging model details.
As this book grows, these details will be moved into versioned chapters that are validated in CI.
Guide
This section covers day-to-day usage, browser tooling, CI, and contributor workflows.
Build and Run
Build the compiler:
cargo build
Run a canonical example:
cargo run -- examples/basics.rk
Useful CLI invocations while developing:
cargo run -- --dump-ast examples/basics.rk
cargo run -- --dump-hir examples/basics.rk
cargo run -- --dump-tokens examples/basics.rk
cargo run -- --emit-rust=target/out.rs examples/basics.rk
cargo run -- --emit-wat=target/out.wat examples/basics.rk
cargo run -- --emit-wasm=target/out.wasm examples/basics.rk
cargo run -- --run examples/basics.rk
Run the full local CI check suite:
./scripts/ci.sh
Web Playground
The browser playground lives in web/. It loads examples/basics.rk by default, compiles in WASM when you click Compile, and shows AST, HIR, MIR, emitted Rust/WAT, and run results from generated WASM.
Only validated WASM is executed; if WASM generation fails, Rust/graphs still render and diagnostics are shown in the output panel.
From the repository root:
cd web
npm install
npm run build:wasm
npm run dev
Local routes match production layout:
npm run dev runs full mdBook + rustdoc staging once via predev, so
Docs and Rustdoc start from current local
artifacts. During dev, docs watcher updates run lightweight mdBook/theme sync
only (no rustdoc rebuild), so CSS/theme changes still stay in sync without
heavy rebuilds.
Run browser E2E tests (Chromium + Firefox):
cd web
npx playwright install chromium firefox
npm run test:e2e
CI and Deploy
Run the full local check suite:
./scripts/ci.sh
./scripts/ci.sh currently runs:
- cargo test
- cargo test --doc
- cargo test --manifest-path crates/rukalang_wasm/Cargo.toml
- mdbook build docs
- mdbook test docs
- mdbook-linkcheck --standalone docs
Browser E2E tests are local-only and are not part of CI:
./scripts/e2e-local.sh
Equivalent manual commands:
cd web
npx playwright install chromium firefox
npm run test:e2e
Codeberg Pages deployment runs from
/.forgejo/workflows/pages.yml
on push to main, publishes the built site output to the pages branch, and
relies on the repository webhook for https://ruka.codeberg.page/RukaLang/ to
refresh the site.
The deploy workflow publishes rustdoc with private items enabled using:
cargo doc --no-deps --document-private-items
Published routes:
- Homepage
- Browser playground
- mdBook documentation
- Rustdoc (public + private items)
Documentation Workflow
RukaLang documentation sync is enforced with executable checks:
- cargo test --doc validates Rust doc examples.
- cargo test validates rk code fences in documentation chapters.
- npm --prefix web run test:wasm-api validates browser WASM API smoke behavior.
- mdbook build docs ensures the book renders.
- mdbook test docs compiles runnable Rust snippets in the book.
- mdbook-linkcheck --standalone docs fails on broken links.
For feature changes:
- Update API docs near code using /// and //! where relevant.
- Update guide/reference pages in the docs source tree when behavior changes.
- Keep command examples aligned with cargo run -- --help and tests.
RukaLang snippet fences use these tags:
- rk: example must compile.
- rk,run: example must compile and run.
- rk,fail: example must fail to compile.
Rustdoc link convention for mdBook pages:
- Use path-relative links rooted at ../../rustdoc/ from docs chapters.
- Prefer reference-style links at the bottom of the page for readability.
- Example:
  - in text: [`LowerMirPass`][rustdoc-lower-mir-pass]
  - link target: [rustdoc-lower-mir-pass]: ../../rustdoc/rukalang/driver/passes/struct.LowerMirPass.html
If a change intentionally requires no docs updates, explain why in the pull request.
Run docs checks locally:
mdbook build docs
mdbook test docs
mdbook-linkcheck --standalone docs
./scripts/ci.sh includes these checks.
Reference
Reference pages document stable user-facing behavior and command-line interfaces, including the browser-facing WASM API, staging model, and MIR representation.
- Language Design (MVP)
- Array and Slice Design
- Goals and Core Model
- Ownership, Borrowing, and Types
- Expression and Call Semantics
- Storage and Runtime Model
- Validation and Diagnostics
- Conventions and Roadmap
- CLI Flags
- Pass Snapshot Schema
- Browser WASM API
- Metaprogramming
- MIR
Language Design (MVP)
This chapter captures the current language and runtime design direction for RukaLang.
Metaprogramming/staging design notes are maintained in
docs/src/reference/metaprogramming.md.
Sections
- Array and Slice Design
- Goals and Core Model
- Ownership, Borrowing, and Types
- Expression and Call Semantics
- Storage and Runtime Model
- Validation and Diagnostics
- Conventions and Roadmap
Array and Slice Design
This page documents the V2 array/slice model as it is implemented today, plus the remaining backend ABI work.
For implementation-level type normalization and coercion policy, see:
docs/src/internals/ownership-representation.md.
Goals
- Make fixed-size and runtime-sized arrays explicit and internally consistent.
- Reserve the word "slice" for borrowed/view semantics only.
- Keep ownership mode on TypeRef (T, &T, @T) unchanged.
- Avoid unnecessary runtime checks and copies.
- Encode slice values in WASM ABI as two value slots (i32 pointer + i32 length).
Surface Language Model
The source syntax remains:
- [T; n] for static arrays.
- [T] for runtime-sized sequence type syntax.
Ownership is still controlled by parameter/local mode markers:
- T view mode.
- &T mutable borrow mode.
- @T owned mode.
For parameters:
- [T] means read-only view of sequence data.
- &[T] means mutable borrow of sequence data.
- @[T] means owned runtime-sized array.
- [T; n] means a compile-time-sized read-only view.
- &[T; n] means a compile-time-sized mutable borrow.
- @[T; n] means owned compile-time-sized array.
Return annotations do not use ownership sigils. Return values are always owned.
Semantic Type Kinds
The redesign uses separate base type concepts. Ownership/mutability is provided
by access mode (View, MutBorrow, Owned), not by the base type itself.
- StaticArray<T, N>
  - Fixed-size value type.
  - Length known at compile time.
  - No runtime length header is needed for the value itself.
- DynamicArray<T>
  - Owned runtime-sized array value.
  - Heap allocated.
  - Carries runtime length metadata as a header before the elements.
- Slice<T>
  - Borrow/view only.
  - Runtime representation is (data_ptr, len).
  - May point into static arrays or dynamic arrays.
  - Never owns backing storage.
- StaticSlice<T, N>
  - Fixed-extent view window used in normalized ownership paths.
  - No new surface syntax; this is an internal base-type concept used for [T; N] and &[T; N] in non-owned positions.
  - Runtime ABI representation is a thin pointer (data_ptr) because N is compile-time known.
  - May point into static arrays, dynamic arrays, or slice views after checks.
Ownership Interpretation
Given the type constructor shape above:
- View mode (T) reads from caller-visible storage.
- Borrowed mode (&T) is mutable borrow.
- Owned mode (@T) creates an owned value by copying or moving.
Examples:
- x: [i64; 4] is a read-only view of StaticSlice<i64, 4>.
- x: &[i64; 4] is a mutable borrow of StaticSlice<i64, 4>.
- x: @[i64; 4] is an owned StaticArray<i64, 4> value transfer.
- x: [i64] is a view Slice<i64>.
- x: &[i64] is a mutable Slice<i64> borrow.
- x: @[i64] is an owned DynamicArray<i64> transfer.
Read-only versus mutable for view/borrow forms is encoded by ownership mode, not by changing the underlying collection type constructor:
- View mode (T) means read-only access.
- Borrow mode (&T) means mutable access.
- Owned mode (@T) means owned value transfer.
Coercion and Compatibility Rules
Coercions are defined over normalized pairs (BaseTy, AccessMode) and consumed
by both checker and MIR lowering.
Value Coercions
- StaticArray<T, N> -> DynamicArray<T> is allowed.
  - Requires allocation of dynamic storage and element transfer/copy.
- DynamicArray<T> -> StaticArray<T, N> is allowed.
  - Requires runtime length check (len == N).
  - Traps on failure.
  - Produces static-array storage for the destination.
Normalized view of the same rules:
- (StaticArray<T, N>, Owned) -> (DynamicArray<T>, Owned) = RequiresMaterialization.
- (DynamicArray<T>, Owned) -> (StaticArray<T, N>, Owned) = AllowedWithRuntimeCheck(len == N) + RequiresMaterialization.
Borrow/View Coercions
- StaticArray<T, N> -> Slice<T> is allowed without copying.
- DynamicArray<T> -> Slice<T> is allowed without copying.
- StaticArray<T, M> -> StaticSlice<T, N> is allowed when M >= N.
  - No runtime check.
  - No copy.
- StaticSlice<T, M> -> StaticSlice<T, N> is allowed when M >= N.
  - No runtime check.
  - No copy.
- Slice<T> -> StaticSlice<T, N> is allowed.
  - Compiler inserts runtime check len >= N unless statically proven.
  - Passing a longer slice is valid.
- DynamicArray<T> -> StaticSlice<T, N> is allowed.
  - Compiler inserts runtime check len >= N unless statically proven.
  - No copy when used as a borrow/view coercion.
- StaticSlice<T, N> -> Slice<T> is allowed without copying.
  - Length is materialized as compile-time constant N in generated code.
Normalized view of borrow/view coercions:
- (StaticArray<T, M>, View|MutBorrow) -> (StaticSlice<T, N>, View|MutBorrow) = AllowedNoCheck when M >= N.
- (StaticSlice<T, M>, View|MutBorrow) -> (StaticSlice<T, N>, View|MutBorrow) = AllowedNoCheck when M >= N.
- (Slice<T>, View|MutBorrow) -> (StaticSlice<T, N>, View|MutBorrow) = AllowedWithRuntimeCheck(len >= N) unless statically proven.
- (DynamicArray<T>, View|MutBorrow) -> (StaticSlice<T, N>, View|MutBorrow) = AllowedWithRuntimeCheck(len >= N) unless statically proven.
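The coercion rules above can be read as a lookup over normalized (BaseTy, AccessMode) pairs. The following Rust sketch is illustrative only and is not the compiler's actual matrix: the enum layout, the coerce function name, the LenCheck helper, and the Rejected outcome are assumptions made for this example. It also assumes that source and destination access modes must match for borrow/view coercions, and it rejects every pair the text does not list (including identity pairs), which is a simplification.

```rust
/// Base type kinds named in this chapter; element types are omitted and
/// lengths are modeled as plain `usize` values (an assumption of this sketch).
#[derive(Clone, Copy, Debug, PartialEq)]
enum BaseTy {
    StaticArray(usize),
    DynamicArray,
    Slice,
    StaticSlice(usize),
}

#[derive(Clone, Copy, Debug, PartialEq)]
enum AccessMode {
    View,
    MutBorrow,
    Owned,
}

/// Hypothetical description of the runtime length check to emit.
#[derive(Clone, Copy, Debug, PartialEq)]
enum LenCheck {
    Eq(usize),
    AtLeast(usize),
}

#[derive(Clone, Copy, Debug, PartialEq)]
enum Coercion {
    AllowedNoCheck,
    AllowedWithRuntimeCheck(LenCheck),
    /// Materialization, optionally guarded by a runtime check.
    RequiresMaterialization { check: Option<LenCheck> },
    /// Not listed in the rules above (assumed outcome for this sketch).
    Rejected,
}

fn coerce(from: (BaseTy, AccessMode), to: (BaseTy, AccessMode)) -> Coercion {
    use AccessMode::*;
    use BaseTy::*;
    match (from, to) {
        // Owned value coercions.
        ((StaticArray(_), Owned), (DynamicArray, Owned)) => {
            Coercion::RequiresMaterialization { check: None }
        }
        ((DynamicArray, Owned), (StaticArray(n), Owned)) => {
            Coercion::RequiresMaterialization { check: Some(LenCheck::Eq(n)) }
        }
        // Borrow/view coercions; identical non-owned modes are assumed.
        ((src, m1), (dst, m2)) if m1 == m2 && m1 != Owned => match (src, dst) {
            (StaticArray(m), StaticSlice(n)) | (StaticSlice(m), StaticSlice(n)) => {
                if m >= n { Coercion::AllowedNoCheck } else { Coercion::Rejected }
            }
            (Slice, StaticSlice(n)) | (DynamicArray, StaticSlice(n)) => {
                Coercion::AllowedWithRuntimeCheck(LenCheck::AtLeast(n))
            }
            (StaticArray(_), Slice) | (DynamicArray, Slice) | (StaticSlice(_), Slice) => {
                Coercion::AllowedNoCheck
            }
            _ => Coercion::Rejected,
        },
        _ => Coercion::Rejected,
    }
}

fn main() {
    let view = AccessMode::View;
    let borrow = AccessMode::MutBorrow;
    println!("{:?}", coerce((BaseTy::Slice, view), (BaseTy::StaticSlice(4), view)));
    println!("{:?}", coerce((BaseTy::StaticArray(8), borrow), (BaseTy::StaticSlice(4), borrow)));
}
```

The sketch does not model the "unless statically proven" refinement; a real implementation would downgrade a runtime check to AllowedNoCheck when the source length is known at compile time.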
Static-Length Reference Behavior
- &[T; N] or view [T; N] can be represented as a thin pointer.
- When source length is already known to satisfy >= N, no runtime check is needed.
- When source length is runtime-known (for example from Slice<T>), compiler inserts a runtime check.
Index and Range Semantics
- xs[i] reads element i using normal bounds policy.
- xs[a..b] always produces Slice<T> (borrow/view).
- Slice ranges always refer to existing storage and carry (ptr, len).
- Creating an owned array from a range requires an explicit owned conversion path.
No implicit owned copy is created for range results.
Allocation and Storage Model
Static Arrays
- Default storage for non-boxed static arrays is stack-like aggregate storage.
- In direct WASM backend this means shadow-stack local storage when needed.
- Static arrays do not use dynamic array headers.
Dynamic Arrays
- Always heap allocated.
- Always carry runtime length metadata in header.
- Release logic follows owned heap object rules.
Slices
- Non-owning view values only.
- Represented as pointer+length pair.
- Static-sized references use StaticSlice<T, N> and are represented as thin pointers; N is carried in type metadata, not runtime payload.
- Never allocate by themselves.
- Never require retain/release ownership operations.
WASM ABI Contract (Current)
ABI projection is derived from normalized (BaseTy, AccessMode) forms.
Core Mapping
- Scalars keep current scalar mapping.
- Static-array references ([T; N] in view/borrow positions) are thin pointer ABI values (i32).
- Dynamic-array owned values (@[T]) use pointer ABI value (i32) to heap object with length header followed by the elements.
- Slice values currently lower as pointer-sized ABI values in direct WASM.
- Static-array and dynamic-array owned values also lower as pointer-sized ABI values in direct WASM.
Equivalent normalized mapping:
- (StaticSlice<T, N>, View|MutBorrow) -> one i32 (thin pointer concept in normalization).
- (Slice<T>, View|MutBorrow) -> currently one i32 runtime handle in direct WASM backend.
- (DynamicArray<T>, Owned) -> one i32 heap handle.
Returns
- Borrowed returns are rejected before ABI planning.
- Owned slice returns currently follow the aggregate out-slot return path in direct WASM.
- Tuple/struct/static-array aggregate returns use out-slot rules.
Shadow Stack
- Owned slice values currently participate in shadow-stack aggregate handling in direct WASM.
- Shadow stack remains for aggregate values that require addressable temporary storage.
MIR and Backend Representation Notes
The implementation is expected to update MIR type/instruction modeling so that:
- Dynamic arrays and slices are different concepts.
- Slice-producing instructions return slice-pair values.
- Call lowering can pass and return slice pairs directly.
- Heap ownership inference excludes slices.
- Heap ownership for static arrays only applies when storage is actually heap-owned (for example boxed paths), not by default.
Runtime Check Insertion Policy
Compiler-inserted checks are required whenever static bounds are not proven.
Examples:
- Slice<T> -> &[T; N] check len >= N.
- DynamicArray<T> -> StaticArray<T, N> check len == N.
- Index/range operations maintain existing bounds safety behavior.
- StaticArray<T, 8> -> &[T; 4] requires no check and no copy.
- Slice<T> -> &[T; 4] checks len >= 4; on success it passes a thin pointer.
When compile-time facts prove the check condition, the check is omitted.
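As a rough analogy in Rust (not generated RukaLang output), the two runtime checks correspond to a prefix test for view coercions and an exact-length test for owned conversions. The function names to_static_view and to_static_array are hypothetical, and the sketch returns None where the text says the compiled program traps.

```rust
use std::convert::TryInto;

/// Slice<T> -> StaticSlice<T, N> analogue: succeeds when `len >= N` and
/// yields a view of the first N elements without copying.
fn to_static_view<T, const N: usize>(xs: &[T]) -> Option<&[T; N]> {
    // `get(..N)` is the `len >= N` check; a longer slice is accepted.
    xs.get(..N)?.try_into().ok()
}

/// DynamicArray<T> -> StaticArray<T, N> analogue: succeeds only when `len == N`.
fn to_static_array<T, const N: usize>(xs: Vec<T>) -> Option<[T; N]> {
    xs.try_into().ok()
}

fn main() {
    let data = vec![1i64, 2, 3, 4, 5];

    // View coercion: len 5 >= 4, so the check passes.
    let window: &[i64; 4] = to_static_view(&data).expect("len >= 4");
    println!("{:?}", window);

    // Owned conversion: len 5 != 4, so this would be the trap path.
    let exact: Option<[i64; 4]> = to_static_array(data);
    println!("{:?}", exact);
}
```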
Implementation Status
Implemented:
- Ownership normalization uses explicit StaticArray, DynamicArray, Slice, and StaticSlice base kinds.
- Shared coercion matrix drives checker and MIR boundary decisions.
- Runtime length checks are emitted for:
  - DynamicArray<T> -> StaticArray<T, N> (len == N)
  - Slice<T>|DynamicArray<T> -> StaticSlice<T, N> (len >= N)
- Runtime trap path for failed coercion checks uses std::panic.
- Return ownership sigils are disallowed; borrowed returns are rejected.
Remaining backend ABI work:
- Move direct WASM slice view ABI to explicit (ptr, len) multi-slot passing and returning.
- Remove slice dependence on aggregate out-slot/shadow-stack paths where the value can be represented directly in locals/results.
Non-Goals
- No new user-facing keywords.
- No parser pre-processing.
- No syntax split for "minimum-length slice" types in this proposal.
Goals and Core Model
Goals
- Use mutable value semantics (MVS): values are logically independent, and each value can be mutated.
- Avoid tracing garbage collection.
- Avoid manual memory management in user code.
- Keep copies explicit at the assignment/operator level with predictable eager-copy behavior.
- Keep memory behavior deterministic.
Core Model
- Mutable Value Semantics - value semantics, no visible aliasing, no identity, but values are locally mutable.
- No garbage collector or manual memory management - deterministic ownership and drop in generated code.
- Second-class references - references may only be created in function and block signatures and may not outlive the function/block invocation.
- Keep ownership costs predictable - borrows and moves avoid extra copy work.
No-Cycle Constraint
The language disallows ownership cycles, and those cycles should be impossible to create within MVS.
- Composite ownership graphs must be acyclic.
- Strong back-references that would create cycles are invalid.
Because cycles are disallowed, ownership-based drop remains sufficient for reclamation.
Ownership, Borrowing, and Types
Type-Level Ownership Modes
- T: read-only view parameter (default for function parameters).
- &T: mutable borrow parameter.
- @T: owned parameter.
Context and Surface Forms
- In type positions (parameter/return annotations, type terms), Name[...] is a type application/constructor form (for example Pair[i64]).
- Pointer indirection is written as *T and is non-nullable.
- Pointer allocation is written as @box(expr) and produces *T when expr: T.
- Option[T] is a built-in optional type constructor. Use Option[*T] for nullable box references.
- *T represents one explicit heap edge to inline T, so the immediate payload T cannot itself be another pointer or a built-in heap-handle value.
- Built-in sequence type forms:
  - [T; N] fixed-size array
  - [T] runtime-sized array
- In expression positions, name(...) is a call expression.
- Resolution is context-driven and validated in later semantic passes.
Ownership markers also exist in both spaces:
- Type-level: T, &T, @T.
- Local/value-level: let x = expr, let @x = expr, &x, <-x forms.
Copyability Classes
- Values are copyable by default.
- Composite types may be explicitly declared linear (non-copyable).
- In MVP, a user-defined linear type must contain at least one linear field (directly or transitively through contained types).
- Copyable types may not contain linear fields (transitively).
- Linear values can be moved and borrowed, but not copied.
- Copyable containers (Array, slices, Map) reject linear element values in MVP.
Array literals ([a, b, c]) construct fixed-size arrays by default. When a
[T] type is expected by context, the same literal form constructs an owned
runtime-sized array.
Return annotations do not accept ownership sigils. Return values are always owned.
Local Binding Rules
- let x = expr introduces a read-only view local.
- let &x = expr introduces a mutable reference local.
- let x <- y introduces a read-only local that now owns the moved value from y.
- let @x = expr introduces an owned local.
- Plain let locals cannot be assigned through, mutably borrowed, or moved with <-x, even when initialized with <-.
- Mutable reference locals (let &x = expr) may be assigned through and mutably borrowed.
- Owned let @x locals may be assigned through, mutably borrowed, and explicitly moved.
Read-only parameters and read-only locals are intentionally different:
- A parameter of type T may alias caller storage because it is only a read-only view.
- A local created with let x <- y becomes the new read-only owner of y's storage after the move.
Borrowing Rules
- Plain T parameters and binders are read-only views. They cannot be assigned through and do not move ownership.
- &T borrows are exclusive mutable borrows; they cannot coexist with any other mutable access to the same value.
- Mutable borrow overlap checks are place-aware for struct fields and conservative for index/slice projections.
- @T parameters receive an owned value. Plain x copies into @T, while <-x moves and invalidates the source binding.
- Borrowed values are non-owning and may not escape the function or block that created them.
Lifetime and Destruction
- Heap values are reclaimed when reference count reaches zero.
- No tracing collector is used.
- Frees of large object graphs can still cause long delays (for now).
- Generated Rust and WASM implement box mutation with uniqueness checks before mutable borrows.
For compiler implementation details, see Borrow Checking (Internals).
Expression and Call Semantics
Assignment Semantics
The language has two assignment operators:
- = means logical value copy.
- <- means ownership transfer that invalidates a named source binding.
Syntax Snapshot (MVP)
- Function declarations:
  - fn name(p1: T, p2: &U, p3: V) -> ReturnType { ... }
  - Return types are required in function signatures.
  - Return annotations do not accept ownership sigils; return values are always owned.
- Assignment:
  - let b = a (read-only view local; source stays valid)
  - let @b = a (owned local initialized from a)
  - let b <- a (read-only view local initialized by move; source is invalid afterward)
  - let @b <- a (owned local initialized by move; source is invalid afterward)
  - let b <- expr and let @b <- expr are invalid when expr is not a named place
  - place.field = expr updates a struct field through a named place path
- Array and slice types:
  - [T; N] is a fixed-size array type
  - [T] is an owned dynamically sized array type
- Array literals:
  - [e1, e2, ...] is an array literal (Rust-like)
  - [] is an empty array literal and needs array type context
- Call arguments:
  - f(<-x) explicit move from named binding
  - f(rvalue_expr) implicit move from rvalue
- Expression statements:
  - expr; discards the expression result regardless of type
- Integer arithmetic:
  - unary -x requires x: i64 and returns i64
  - binary +, -, *, /, % require i64 operands and return i64
  - precedence follows * / % above + -, and unary - binds tighter than both
Copy Assignment (=)
- Source and destination remain valid after assignment.
- Copy assignment is valid only for copyable values.
- Runtime performs an eager owned copy for copyable heap-backed values.
Local Declarations
- let x = expr creates a read-only view local.
- let &x = expr creates a mutable reference local.
- let x <- y creates a read-only local that owns the moved value from y.
- let @x = expr creates an owned local.
- Assigning to x or borrowing &x requires let @x or let &x = ...; moving <-x still requires let @x.
- Assigning to x is valid when x was declared with let &x = ....
- let x <- y invalidates y, but x still remains read-only after the move.
Read-only parameters and read-only locals do not have identical storage semantics:
- A parameter x: T may read directly from caller-owned storage.
- A local let x <- y becomes the new read-only owner after the move and does not alias a still-live source binding.
Move Assignment (<-)
- <- is only valid when the source is a named binding/place.
- Destination receives ownership and source becomes invalid.
- <- on rvalues/temporaries is invalid (there is no name to invalidate).
- Using moved-from source is a runtime or compile-time error (depending on checker stage).
Arrays and Slices
- [T; N] is an owned value and participates in =, <-, and <-x like other owned values.
- [T] is an owned runtime-sized array value and is represented as an owned contiguous sequence.
- In parameter mode T, [T] means a read-only slice view.
- In parameter mode &T, [T] means a mutable slice view.
- Slice parameters can accept array arguments with element-compatible item types.
- @box(expr) allocates a non-null pointer value of type *T.
- @array(init, len) constructs heap arrays ([T]) by inferring T from init.
- @as(T, x) performs compile-time-safe numeric casts only.
- @intCast(T, x) performs checked integer casts and traps on overflow.
- @intToFloat(T, x) converts integer values to floating-point values.
- @trunc(T, x) allows narrowing integer-to-integer and float-to-float casts.
Checked cast edge cases: @intCast(i8, 120i16) succeeds, while
@intCast(u8, -1i16) and @intCast(i8, 255u16) trap.
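The same edge cases can be reproduced with Rust's standard TryFrom conversions, which fail exactly when the value does not fit the destination type. This is only an analogy for the range check: the int_cast helper name is hypothetical, and it returns None where @intCast would trap.

```rust
use std::convert::TryFrom;

/// Checked integer conversion: `Some(value)` when `x` fits in `T`, else `None`.
fn int_cast<T, U>(x: U) -> Option<T>
where
    T: TryFrom<U>,
{
    T::try_from(x).ok()
}

fn main() {
    // 120 fits in i8 (range -128..=127).
    println!("{:?}", int_cast::<i8, i16>(120));
    // -1 is below the u8 range, and 255 is above the i8 range.
    println!("{:?}", int_cast::<u8, i16>(-1));
    println!("{:?}", int_cast::<i8, u16>(255));
}
```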
Expression-Oriented Semantics
The language is expression-oriented (Rust-style): control-flow constructs and blocks are expressions.
Unit Result
- Unit is implicit in syntax (no required () literal in MVP).
- A block with no tail expression yields unit.
- while yields unit.
if Expression Rules
- if (cond) { then } else { other } is an expression.
- if (cond) { then } (no else) is allowed only if then yields unit.
- With else, both branches must yield compatible result categories.
Pointer and Option Semantics
Pointers are non-null owned handles used for explicit indirection.
- Pointer types are written as *T.
- @box(expr) constructs a pointer value of type *T when expr: T.
- Option[T] is the built-in optional type constructor; use Option[*T] when a boxed value may be absent.
- *expr dereferences a pointer expression and reads the pointee as a value.
- Field/index projection through a pointer base implicitly dereferences as needed (for example node.next.value where node.next: *Node).
- Passing &b when b: *T borrows the pointee as &T.
- &*b is rejected.
- *T adds exactly one explicit heap indirection whose allocation stores T inline.
- The immediate payload T cannot itself be another pointer or a built-in heap-handle type such as String or owned runtime-sized array values.
Pointer and Optional Construction Examples
let @head = Some(@box(Node { value: 1, next: None() }));
match (head) { Some(node) => node.value, None() => 0 };
Result Discard Rules
- Any expression statement form expr; discards the result.
- let _ = expr, let _ <- expr, and let @_ <- expr are ordinary bindings to _ (not special discard forms).
Iteration Semantics
The language does not expose first-class references, so iterator design avoids storing borrowed references in user-visible iterator objects.
for Forms
- for (collection) |x| { ... }
  - Plain binder |x| is a read-only view of each element.
- for (collection) |&x| { ... }
  - Mutable-borrow binder |&x| follows &T semantics (exclusive mutable borrow per iteration).
- for (<-collection) |x| { ... }
  - Consuming traversal form; <-collection invalidates the named collection binding.
  - Elements are yielded to x using the normal plain-binder read-only view semantics.
collection may be [T; N] or [T].
Normalization Rule
- for (<-collection) |item| { ... } consumes collection and invalidates its binding.
- for (collection) |<-item| { ... } is invalid; iteration does not move elements out directly.
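Readers coming from Rust may find it helpful to compare the three for forms with Rust's borrowed, mutably borrowed, and consuming loops. This is a loose analogy under stated assumptions, not a description of how RukaLang lowers loops; the traverse function is a hypothetical name. One difference is visible in the last loop: Rust's consuming loop yields owned elements, whereas the text above says RukaLang still yields read-only views.

```rust
/// Runs the three traversal styles over one vector and returns
/// (sum observed by the read-only pass, elements seen by the consuming pass).
fn traverse(mut xs: Vec<i64>) -> (i64, Vec<i64>) {
    // Analogue of `for (collection) |x|`: read-only view of each element.
    let mut sum = 0;
    for x in &xs {
        sum += *x;
    }

    // Analogue of `for (collection) |&x|`: exclusive mutable borrow per iteration.
    for x in &mut xs {
        *x *= 2;
    }

    // Analogue of `for (<-collection) |x|`: the collection binding is consumed
    // and cannot be used after this loop.
    let mut seen = Vec::new();
    for x in xs {
        seen.push(x);
    }

    (sum, seen)
}

fn main() {
    let (sum, seen) = traverse(vec![1, 2, 3]);
    println!("sum = {}, seen = {:?}", sum, seen);
}
```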
Single-Form Principle
- Redundant forms are disallowed to keep one canonical way to express behavior.
- <- is valid on the iterable expression only for consuming traversal from a named collection.
Call Semantics
Index and Slice Access
- xs[i] reads through a read-only view.
- xs[a..b] produces a read-only slice view.
- &xs[i] and &xs[a..b] request mutable borrows.
- let x = place_expr and let &x = place_expr can bind references to fields/indexed items for the binding scope.
- Local mutable borrows are checked for overlap: disjoint struct fields may coexist, while index/slice projections on the same root are conservatively treated as overlapping.
- Copyable indexed elements may still be copied into owned locals or other owned contexts when required.
- Plain slice views are not copied into owned values implicitly.
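The overlap rule for local mutable borrows can be sketched as a comparison of place paths. The Rust code below is a minimal model and not the checker's implementation: Place, Proj, place, and may_overlap are hypothetical names, and index and slice projections are collapsed into a single Index case because the rule treats them identically.

```rust
/// One projection step in a place path.
#[derive(Clone, Copy, Debug, PartialEq)]
enum Proj {
    /// Struct field access such as `p.x`.
    Field(&'static str),
    /// Index or slice access such as `xs[i]` or `xs[a..b]`.
    Index,
}

#[derive(Clone, Debug)]
struct Place {
    root: &'static str,
    path: Vec<Proj>,
}

fn place(root: &'static str, path: &[Proj]) -> Place {
    Place { root, path: path.to_vec() }
}

/// Returns true when two places must be treated as overlapping.
fn may_overlap(a: &Place, b: &Place) -> bool {
    if a.root != b.root {
        return false; // different root bindings never overlap
    }
    for (pa, pb) in a.path.iter().zip(b.path.iter()) {
        match (pa, pb) {
            // Distinct fields of the same struct are disjoint.
            (Proj::Field(fa), Proj::Field(fb)) if fa != fb => return false,
            // Same field: keep comparing deeper projections.
            (Proj::Field(_), Proj::Field(_)) => continue,
            // Any index/slice projection is conservatively overlapping.
            _ => return true,
        }
    }
    // One path is a prefix of the other (for example `p` and `p.x`).
    true
}

fn main() {
    let px = place("p", &[Proj::Field("x")]);
    let py = place("p", &[Proj::Field("y")]);
    let xs_i = place("xs", &[Proj::Index]);
    let xs_j = place("xs", &[Proj::Index]);
    println!("p.x vs p.y: {}", may_overlap(&px, &py));
    println!("xs[i] vs xs[j]: {}", may_overlap(&xs_i, &xs_j));
}
```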
Argument Forms
- f(x)
  - Valid for T (read-only view) and @T (owned copy).
- f(&x)
  - Valid only when parameter type is &T.
- f(<-x)
  - Explicit move into an owned parameter (@T), invalidating named binding x.
- f(rvalue_expr)
  - For T, bare rvalues are borrowed for the duration of the call.
  - For @T, bare rvalues are passed as owned temporaries.
Parameter Mode Rules
For a parameter declared as T:
- Passing f(x) borrows a read-only view.
- Passing f(&x) or f(<-x) is invalid.
- The callee cannot assign through the parameter binding.
For a parameter declared as &T:
- Passing f(&x) creates a mutable borrow.
- Other argument forms are invalid.
For a parameter declared as @T:
- Passing f(x) copies into a fresh owned value.
- Passing f(<-x) moves ownership and invalidates x.
- Passing f(rvalue) moves the temporary into the callee.
<- exists only to invalidate named bindings; temporaries already have no source binding to invalidate.
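The parameter mode rules amount to a small table from (parameter mode, argument form) to a passing strategy. The Rust sketch below encodes that table under the assumption that the argument has already been classified; ParamMode, ArgForm, Passing, and check_arg are hypothetical names and not the compiler's API. It does not model the separate rule that & and <- require a named place.

```rust
#[derive(Clone, Copy, Debug, PartialEq)]
enum ParamMode {
    View,      // T
    MutBorrow, // &T
    Owned,     // @T
}

#[derive(Clone, Copy, Debug, PartialEq)]
enum ArgForm {
    Plain,     // f(x)
    MutBorrow, // f(&x)
    Move,      // f(<-x)
    Rvalue,    // f(rvalue_expr)
}

#[derive(Clone, Copy, Debug, PartialEq)]
enum Passing {
    BorrowView,
    BorrowMut,
    CopyIn,
    MoveIn,
}

/// Decide how an argument is passed, or explain why the call is invalid.
fn check_arg(param: ParamMode, arg: ArgForm) -> Result<Passing, &'static str> {
    match (param, arg) {
        (ParamMode::View, ArgForm::Plain) | (ParamMode::View, ArgForm::Rvalue) => {
            Ok(Passing::BorrowView)
        }
        (ParamMode::MutBorrow, ArgForm::MutBorrow) => Ok(Passing::BorrowMut),
        (ParamMode::Owned, ArgForm::Plain) => Ok(Passing::CopyIn),
        (ParamMode::Owned, ArgForm::Move) | (ParamMode::Owned, ArgForm::Rvalue) => {
            Ok(Passing::MoveIn)
        }
        (ParamMode::View, _) => Err("T parameters accept only plain arguments or rvalues"),
        (ParamMode::MutBorrow, _) => Err("&T parameters require &arg"),
        (ParamMode::Owned, _) => Err("&arg cannot bind to @T"),
    }
}

fn main() {
    println!("{:?}", check_arg(ParamMode::Owned, ArgForm::Plain));
    println!("{:?}", check_arg(ParamMode::View, ArgForm::Move));
}
```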
Copy Strategy
Copyable heap aggregates (String, arrays, records) use eager copy semantics
for = and owned argument copies.
Linear values cannot have aliases other than statically-checked borrows.
Storage and Runtime Model
Storage Model (v0.1)
Storage representation is an implementation detail, but the runtime should follow these rules.
- Semantics stay value-based (no observable pointer identity).
- Layout for sized types is deterministic (size, align, field offsets).
- Nested sized structs are stored inline.
- Built-in indirection type constructor: *T.
- *T is non-nullable and represented as the address of one heap allocation containing T inline.
- Option[*T] carries nullable box references at the language level and uses option enum layout with pointer niche optimization in backends.
- User-defined recursive/non-stack-allocatable types must use explicit *T at recursion/indirection points.
- v0.1 uses no implicit boxing for user-defined type recursion.
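Rust's Option<Box<T>> follows the same pattern and makes the pointer niche observable: because a box is never null, the None case reuses the null address and the optional box is no larger than a plain box. The sketch below uses that as an analogy for Option[*T]; the Node type and sum_list function are illustrative, and nothing here describes RukaLang's generated layout.

```rust
use std::mem::size_of;

/// Recursive type with explicit indirection at the recursion point,
/// analogous to a RukaLang struct with a field of type Option[*Node].
struct Node {
    value: i64,
    next: Option<Box<Node>>,
}

/// Walks the list through read-only views and sums the values.
fn sum_list(list: &Option<Box<Node>>) -> i64 {
    let mut total = 0;
    let mut cur = list;
    while let Some(node) = cur {
        total += node.value;
        cur = &node.next;
    }
    total
}

fn main() {
    let tail = Some(Box::new(Node { value: 2, next: None }));
    let head = Some(Box::new(Node { value: 1, next: tail }));

    println!("sum = {}", sum_list(&head));
    // Niche optimization: the optional box has the size of a plain box.
    println!(
        "Option<Box<Node>>: {} bytes, Box<Node>: {} bytes",
        size_of::<Option<Box<Node>>>(),
        size_of::<Box<Node>>()
    );
}
```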
Inline vs Handle-Backed
- Inline storage is used for small/sized copyable values in locals and parameters.
- Built-in aggregates (String, slices, arrays) manage their own internal storage and do not require *T.
- Pointer-backed storage (*T) is used for explicit user-directed indirection and recursive graph-like user-defined data.
- The immediate payload of *T must not itself be a built-in heap-handle type.
Copyable and Linear Storage
- Copyable heap-backed values use eager copies at copy boundaries.
- v0.1 rule: non-stack-allocatable linear user-defined values require explicit *T indirection.
- Linear values are move-only, but borrow safety still applies.
Iteration and Move-Out
- Iteration does not move elements out directly.
- Consuming traversal is expressed by moving the container binding (for (<-collection) |x|).
Runtime Representation
Value Categories
- Immediate: inline scalar (i64, Bool, small enums).
- Heap object: pointer to cell with metadata header.
Heap Header (minimum)
- Type tag/layout id.
- Flags.
Current implementation status:
- Generated Rust pointer copies clone pointee values into fresh cells.
- Generated WASM pointer copies allocate fresh cells, and release frees pointer cells on drop.
- Pointer release also recursively walks heap-backed pointee payloads before freeing the outer pointer cell.
- Generated WASM runtime strings are stored in linear memory; literals stay static and owned string values are freed on drop.
- Generated WASM array storage frees backing allocations on drop.
- Array release walks nested pointer/string/array elements before freeing the outer storage.
- Nested tuple/struct aggregate fields are recursively walked during release, then aggregate storage is freed.
Interpreter/VM Requirements
Minimum conceptual bytecode operations:
- ASSIGN_COPY dst, src
- ASSIGN_MOVE dst, src
- BORROW_RO dst, src
- BORROW_MUT dst, src
- DROP x
- field/index mutation ops
The VM must emit precise drops at all scope/control-flow exits.
Runtime IR Boundary (v0.1 direction)
- Runtime execution must not rely on source-level TypeExpr annotations.
- Type/mode validation is performed in a checker pass before runtime IR generation.
- Checker failures are hard errors and must stop execution.
- Runtime IR carries only execution data (locals, blocks, ops, control-flow edges), plus dynamic runtime values.
- Runtime checks remain for dynamic semantics (use-after-move, borrow exclusivity/lifetime, Bool condition enforcement).
Runtime IR Shape (Wasm-like, custom)
- Use block + terminator CFG shape similar to Wasm control flow.
- Keep ownership effects explicit as IR ops (Copy, Move, BorrowRo, BorrowMut, Drop).
- Prefer dense entity IDs (cranelift_entity) and contiguous storage to minimize pointer chasing.
- Resolve call targets to function IDs during lowering (avoid runtime name lookup in hot paths).
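One way to picture this shape is a block list with explicit ownership ops and a terminator per block. The Rust sketch below is a guess at a minimal form and not the project's runtime IR: all type and field names are hypothetical, dense IDs are modeled as plain u32 newtypes instead of cranelift_entity to keep the example self-contained, and first_dead_use is an illustrative straight-line scan showing how explicit Move and Drop ops make use-after-move detectable.

```rust
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
struct LocalId(u32);
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
struct BlockId(u32);
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
struct FuncId(u32);

/// Ownership effects are explicit ops rather than implied by types.
#[allow(dead_code)]
#[derive(Debug)]
enum Op {
    Copy { dst: LocalId, src: LocalId },
    Move { dst: LocalId, src: LocalId },
    BorrowRo { dst: LocalId, src: LocalId },
    BorrowMut { dst: LocalId, src: LocalId },
    Drop { local: LocalId },
    /// Call target already resolved to a function ID during lowering.
    Call { func: FuncId, args: Vec<LocalId> },
}

#[allow(dead_code)]
#[derive(Debug)]
enum Terminator {
    Jump(BlockId),
    Branch { cond: LocalId, then_block: BlockId, else_block: BlockId },
    Return,
}

#[allow(dead_code)]
struct Block {
    ops: Vec<Op>,
    term: Terminator,
}

/// Scans one block in order and returns the first local that is read
/// after it was moved from or dropped.
fn first_dead_use(block: &Block, local_count: u32) -> Option<LocalId> {
    let mut dead = vec![false; local_count as usize];
    for op in &block.ops {
        let reads: Vec<LocalId> = match op {
            Op::Copy { src, .. }
            | Op::Move { src, .. }
            | Op::BorrowRo { src, .. }
            | Op::BorrowMut { src, .. } => vec![*src],
            Op::Drop { local } => vec![*local],
            Op::Call { args, .. } => args.clone(),
        };
        if let Some(bad) = reads.iter().find(|l| dead[l.0 as usize]) {
            return Some(*bad);
        }
        match op {
            Op::Move { dst, src } => {
                dead[src.0 as usize] = true;
                dead[dst.0 as usize] = false;
            }
            Op::Copy { dst, .. } | Op::BorrowRo { dst, .. } | Op::BorrowMut { dst, .. } => {
                dead[dst.0 as usize] = false;
            }
            Op::Drop { local } => dead[local.0 as usize] = true,
            Op::Call { .. } => {}
        }
    }
    None
}

fn main() {
    let block = Block {
        ops: vec![
            Op::Move { dst: LocalId(1), src: LocalId(0) },
            Op::Copy { dst: LocalId(2), src: LocalId(0) }, // reads a moved-from local
        ],
        term: Terminator::Return,
    };
    println!("{:?}", first_dead_use(&block, 3));
}
```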
Validation and Diagnostics
Safety Rules
- Use-after-move is a compile-time error.
- Writing through read-only borrow is a compile-time error.
- Borrow escapes are compile-time errors unless explicitly converted into owned values by copying.
- Mutable alias violations are compile-time errors wherever possible; any need for runtime checking will be explicitly called out in the documentation.
- No first-class references; references can never escape or otherwise outlive the function invocation, and may only be created through function or block parameters.
Error Model
For interpreted MVP, runtime checks are acceptable, but they will be called out in comments and documentation.
- Diagnostics should identify binding and operation that failed.
- Move errors should suggest = (copy) when user intended to keep source valid.
- Borrow errors should suggest & (or loop |&x|) when mutation was attempted through read-only access.
- Owned-argument errors should suggest plain x (copyable borrow/copy semantics) or <-x (move) for named bindings.
- Linear-copy errors should suggest <-x (move) or borrow forms (x/&x).
- Map-key errors should state that keys must be copyable and linear keys are invalid.
MVP Checker Matrix (Recommended)
This section defines what is enforced statically versus at runtime in v1.
Static in v1
- Parse and validate ownership mode annotations and required return types in function signatures.
- Parse and validate local ownership markers (let x, let &x, let @x).
- Validate call-site marker compatibility:
  - plain arg allowed for T (read-only view) and @T (owned copy).
  - &arg allowed only for &T parameters.
  - <-arg allowed only for @T parameters, and only when arg is a named place.
  - plain rvalue expressions allowed for T and @T (borrow for T, owned temporary for @T).
- Treat plain xs[i] and xs[a..b] as read-only access forms; allow copying indexed elements into owned contexts, but reject implicit owned copies of slice views.
- Reject obvious invalid borrow targets (& on non-place expressions).
- Reject &*expr; mutable pointee borrows must use &name where name: *T.
- Reject overlapping mutable/shared uses of the same place while local borrows are live.
- Allow disjoint struct-field borrows in the same scope.
- Treat index/slice borrow overlap conservatively (same root collection overlaps).
- Reject <- on non-place expressions.
- Reject illegal assignment forms, including <- from rvalues/non-place sources.
- Reject null and pointer-binding condition forms (if (p) |x| { ... }, if (p) |&x| { ... }).
- Reject = when operand value is linear.
- Reject linear values in copyable containers (Array, slices, Map).
- Reject linear values as map keys.
- Validate Some(...) and None() constructor arity and enforce match exhaustiveness for Option[T].
- Validate *expr only when expr: *T.
- Validate for binders (|x|, |&x|) and reject invalid binders like |<-x|.
- Allow consuming traversal only as for (<-collection) |x| with named-place move semantics.
- if without else is valid only when the then branch is unit.
- Enforce that functions declare intended ownership behavior in signatures.
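The call-site marker rules in this list can be read as a small compatibility table. The following Rust sketch is illustrative only: the names ParamMode, ArgForm, and marker_compatible are hypothetical and are not taken from the compiler, and the linear-value rules are deliberately left out.

```rust
/// Parameter ownership modes from a signature (`T`, `&T`, `@T`).
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub enum ParamMode {
    View,      // T
    MutBorrow, // &T
    Owned,     // @T
}

/// Argument forms at a call site (hypothetical names for illustration).
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub enum ArgForm {
    /// Plain `arg` naming a place.
    PlainPlace,
    /// Plain rvalue expression such as `[3, 4]`.
    PlainRvalue,
    /// `&arg`; `is_place` is false for temporaries like `&[1, 2]`.
    MutBorrow { is_place: bool },
    /// `<-arg`; `is_named_place` is false for forms like `<-[3, 4]`.
    Move { is_named_place: bool },
}

/// Returns true when the argument form may bind to the parameter mode.
/// Linear-value restrictions are not modeled here.
pub fn marker_compatible(arg: ArgForm, param: ParamMode) -> bool {
    match (arg, param) {
        // Plain places and rvalues bind to T (view) and @T (copy / owned temporary).
        (ArgForm::PlainPlace | ArgForm::PlainRvalue, ParamMode::View | ParamMode::Owned) => true,
        // &arg binds only to &T and needs a place expression.
        (ArgForm::MutBorrow { is_place }, ParamMode::MutBorrow) => is_place,
        // <-arg binds only to @T and needs a named place to invalidate.
        (ArgForm::Move { is_named_place }, ParamMode::Owned) => is_named_place,
        _ => false,
    }
}
```

Under these assumptions, marker_compatible(ArgForm::PlainPlace, ParamMode::MutBorrow) is false, which corresponds to the edit(a) line in the conformance examples.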
Conformance Examples (v1)
// Assume signatures:
// fn view(x: T) -> T
// fn log(x: T) -> Unit
// fn edit(x: &T) -> Unit
// fn consume(x: @T) -> Unit
let @a = [1, 2]
let @b = a
view(a) // valid: plain arg to T
view(&a) // invalid: &arg requires &T parameter
view(<-a) // invalid: <-arg requires @T parameter
edit(&a) // valid: mutable borrow to &T
edit(a) // invalid: missing & for &T parameter
edit(&[1, 2]) // invalid: & requires place expression, not temporary
consume(a) // valid: copy into @T
consume(<-a) // valid: explicit move into @T; a invalid after this call
consume(&a) // invalid: &arg cannot bind to @T
consume([3, 4]) // valid: rvalue to @T moves directly
consume(<-[3,4]) // invalid: <- requires a named source to invalidate
// Assume: linear Handle
let @s = make_handle()
let s2 = s // invalid: linear values cannot be copied
consume(<-s) // valid: move; s invalid after call
set_map_key(s, 1) // invalid: map keys must be copyable; linear values are not valid keys
let @c <- b // valid move assignment; b invalid after move
let d = c // valid read-only view local
let e <- d // valid move into a read-only view local; d invalid after move
let @f <- [5, 6] // invalid: <- cannot be used with rvalue source
consume(<-e) // invalid: e is read-only even though it now owns the moved value
if a { log(d) } // valid: no else and then-branch is unit
if a { view(d) } // invalid: no else requires unit then-branch
if a { view(d); } // valid: expression statement discards return value
view(d); // valid: return value discarded
let _ <- view(d) // valid: ordinary binding using move assignment
let _ = view(d) // valid: ordinary binding using copy assignment
for (d) |item| { log(item); } // valid plain element binding
for (d) |&item| { log(item); } // valid mutable-borrow element binding
for (<-d) |item| { log(item); } // valid consuming traversal; d invalidated
for (d) |<-item| { log(item); } // invalid: cannot move elements out via loop binding
let @node = Some(@box(Node { value: 1, next: None() }))
match (node) {
Some(n) => log(int_to_string(n.value)),
None() => log("empty"),
}
let @p = @box(1)
edit(&p) // valid: &p borrows pointee as &i64 when p: *i64
edit(&*p) // invalid: &*expr is rejected
Expected diagnostics for invalid lines should mention:
- required parameter mode (T, &T, @T),
- provided argument form (x, &x, <-x, or rvalue),
- and one concrete fix suggestion.
Conventions and Roadmap
Standard Library API Conventions
- Read-only APIs use plain T parameters.
- In-place mutation APIs use &T parameters.
- Ownership-taking APIs use @T parameters.
Example Semantics
let @a = [1, 2, 3] // array literal: [i64; 3]
let @b = a // copy assignment: both valid, eager copy
push(&b, 4) // mutable borrow
len(a) // read-only borrow
let @c <- b // move assignment: b becomes invalid
consume(c) // copy into owned parameter
consume(<-c) // explicit move into owned parameter; c invalid after call
consume([9, 9, 9]) // rvalue array moves directly to matching @ type
let @head = Some(@box(Node { value: 7, next: None() }))
match (head) {
Some(node) => log(int_to_string(node.value)),
None() => log("empty"),
}
if (cond) {
log(c);
}
make_value();
Behavioral guarantee: mutating one logical value never causes visible mutation of another logical value (no visible aliasing).
Implementation Phases
Phase 1: MVP Runtime
- Type modes T, &T, @T for function parameters.
- Required function return types.
- Operators = and <- with defined validity rules.
- Deterministic drop for owned values.
- Eager copy on aggregate copy boundaries.
- Non-null *T boxes with optionality represented as Option[*T].
- Runtime checks for borrow and move violations.
Phase 2: Ergonomics and Performance
- Better diagnostics.
- Escape analysis for temporary borrow elision.
- Small-string optimization and vector growth optimizations.
- Reduce unnecessary temporary heap traffic.
Phase 3: Static Validation
- Ahead-of-time validation for common move/borrow errors.
- Earlier detection of alias/lifetime violations.
- Optional strict mode with minimal runtime borrow checks in verified code.
Open Design Questions
- Whether local bindings are immutable by default.
- Whether <- is allowed in destructuring/pattern assignment for v1.
- Which diagnostics are mandatory for MVP versus best-effort.
- How/whether to support explicit aliases.
- How to implement custom iterators.
CLI Flags
Primary invocation pattern:
cargo run -- [FLAGS] <input.rk>
Common flags currently used in examples:
- --dump-ast
- --dump-hir
- --dump-tokens
- --dump-pass-timings
- --dump-pass-snapshots
- --dump-pass-snapshots-json
- --emit-rust=PATH
- --emit-wat=PATH
- --emit-wasm=PATH
- --run
For now, use cargo run -- --help as the canonical source for
flag behavior.
For machine-readable pass snapshots, see Pass Snapshot Schema.
Pass Snapshot Schema
This page defines the stable JSON lines schema emitted by:
--dump-pass-snapshots-json
JSON Envelope (schema v1)
Each line is one JSON object:
{
"kind": "pass_snapshot",
"schema_version": 1,
"snapshot_kind": "hir_program",
"name": "hir.lower_program",
"detail": "functions=3 exprs=42 stmts=17",
"fields": {
"functions": 3,
"exprs": 42,
"stmts": 17
}
}
Stable top-level keys:
- kind: always pass_snapshot
- schema_version: currently 1
- snapshot_kind: stable semantic kind enum value
- name: concrete pass name
- detail: human-readable summary
- fields: machine-readable key/value object
snapshot_kind Values
Stable values in schema v1:
- meta_program
- elab_program
- hir_program
- check_program
- mir_program
- codegen_rust
- codegen_wasm
Field Contracts (schema v1)
meta_program (meta.expand_program):
- functions (u64)
- structs (u64)
- enums (u64)
elab_program (elab.elaborate_program):
- functions (u64)
- structs (u64)
- enums (u64)
hir_program (hir.lower_program):
- functions (u64)
- exprs (u64)
- stmts (u64)
check_program (check.check_program):
- signatures (u64)
- local_symbols (u64)
- occurrences (u64)
mir_program (mir.lower_program):
- functions (u64)
- locals (u64)
- instrs (u64)
codegen_rust (codegen.rust.emit_program):
- lines (u64)
- bytes (u64)
codegen_wasm (codegen.wasm.emit_program):
- wat_bytes (u64)
- wasm_bytes (u64)
- diagnostics (u64)
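A consumer of the JSONL stream might encode these field contracts as a lookup table. The sketch below is a hedged illustration, not the logic of scripts/validate-pass-snapshot-jsonl.py: required_fields and missing_fields are hypothetical helper names, and JSON parsing is assumed to have happened elsewhere so that only key names are checked.

```rust
/// Required `fields` keys per `snapshot_kind` in schema v1, as listed above.
pub fn required_fields(snapshot_kind: &str) -> Option<&'static [&'static str]> {
    let keys: &'static [&'static str] = match snapshot_kind {
        "meta_program" | "elab_program" => &["functions", "structs", "enums"],
        "hir_program" => &["functions", "exprs", "stmts"],
        "check_program" => &["signatures", "local_symbols", "occurrences"],
        "mir_program" => &["functions", "locals", "instrs"],
        "codegen_rust" => &["lines", "bytes"],
        "codegen_wasm" => &["wat_bytes", "wasm_bytes", "diagnostics"],
        _ => return None,
    };
    Some(keys)
}

/// Returns the required keys that are absent from `present`,
/// or `None` when the snapshot kind is unknown.
/// Extra keys are accepted, since v1 allows new keys to be added to `fields`.
pub fn missing_fields(snapshot_kind: &str, present: &[&str]) -> Option<Vec<&'static str>> {
    let required = required_fields(snapshot_kind)?;
    Some(
        required
            .iter()
            .copied()
            .filter(|key| !present.contains(key))
            .collect(),
    )
}
```

Treating unknown kinds as None rather than as an error keeps the decision about forward compatibility with the caller.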
Compatibility Rules
- Existing keys and snapshot_kind values are stable within schema v1.
- New keys may be added to fields in v1, but existing keys keep their meaning.
- Any breaking change requires incrementing schema_version.
Validation Script
Validate captured JSONL output with:
python3 scripts/validate-pass-snapshot-jsonl.py snapshots.jsonl
Example capture + validation:
cargo run -- --dump-pass-snapshots-json examples/basics.rk > snapshots.jsonl
python3 scripts/validate-pass-snapshot-jsonl.py --strict-pass-name snapshots.jsonl
CI integration note:
- ./scripts/ci.sh validates any *.jsonl fixtures under tests/fixtures/pass-snapshots/.
- If no fixtures exist, this check is skipped.
Browser WASM API
The browser wrapper crate is crates/rukalang_wasm.
It currently exposes these wasm-bindgen APIs:
- compile_for_browser_json(source_name, source_text)
- analyze_for_browser_json(source_name, source_text)
- lex_for_browser_json(source_text)
Success payload fields:
- ast_graph
- hir_graph
- mir_graph
- rust_source
- wat_source
- wasm_bytes (optional u8 array; present only when binary emission succeeds)
- wasm_diagnostics (non-fatal backend diagnostics)
Build prerequisite for direct WASM emission:
- run ./scripts/build-runtime-wasm.sh before Rust or web WASM builds/tests that consume browser artifacts
Error payload shape:
- object with diagnostics array
- each diagnostic includes phase and message
- syntax diagnostics may include line and column
- phase values currently include module, syntax, meta, check, mir_lower, and codegen
compile_for_browser_json behavior notes:
- Rust/AST/HIR/MIR artifacts are emitted when frontend + MIR + Rust codegen pass.
- AST/HIR/MIR graph payloads are browser-friendly Cytoscape data rather than DOT text.
- WASM backend diagnostics are reported in wasm_diagnostics without failing the whole compile payload.
- wasm_bytes is omitted when the current source uses unsupported WASM backend features.
- wat_source is generated from emitted wasm_bytes via wasmprinter, with synthesized names enabled to improve readability; when WASM emission fails, wat_source is empty.
- Emitted wasm_bytes are validated with wasmparser before they are returned.
- Emitted browser WASM exports run_main, which calls runtime assert_no_leaks by default after invoking user main.
analyze_for_browser_json payload shape:
- ok: boolean (true when no diagnostics were produced)
- diagnostics: same diagnostic entry shape as compile errors
- highlight_spans: lexical/semantic token spans used by the editor
Validate the wrapper crate:
cargo test --manifest-path crates/rukalang_wasm/Cargo.toml
Metaprogramming
This chapter defines the current staging model used by RukaLang.
Goals
- Keep runtime and compile-time behavior clearly separated.
- Keep local type inference and require explicit type annotations only at boundaries.
- Support explicit dynamic behavior through tagged unions, not implicit dynamic typing.
- Enable future self-hosting/compiler-in-language workflows with typed staged code.
Phase Model
- Runtime code executes normally.
- Compile-time code executes only in explicit staged contexts.
- Types are compile-time values (type) and do not flow as runtime values.
Context-Sensitive Surface Forms
RukaLang intentionally reuses several syntactic forms across runtime and staged contexts.
Syntax notes:
- Name[...] is used in type/meta contexts for type application and constructor forms.
- name(...) is used in expression contexts for function/meta-function calls.
- Resolution is context-driven and validated in semantic passes.
Ownership markers appear in both spaces:
- type-level: T, &T, @T
- expression-level: &x, <-x
Supported First-Pass Syntax
Current parser/evaluator support covers:
- meta fn declarations for compile-time-only functions.
- meta { ... } blocks as explicit compile-time statement forms.
- expr { ... } typed runtime-expression builders that produce Expr[T].
- %{ ... } quote expressions.
- $expr splice expressions in staged contexts.
- $(...) inline runtime splices that evaluate a meta expression and require Code[...].
- pattern match over staged values:
  - direct type patterns for type values (for example struct { x: i64 })
  - quote patterns %{ ... } for code values
- code type constructor usage in type position: Code[T].
- typed expression constructor usage in type position: Expr[T].
- quoted/runtime struct operations in expressions:
  - construction: Name { field: value, ... }
  - field read: value.field
  - field update statement: value.field = expr;
Common aliases by convention:
- Expr[T] for typed runtime expression generation.
- Unit aliases the empty tuple type Tuple[].
Example
meta fn choose(flag: Bool, yes: Expr[i64], no: Expr[i64]) -> Expr[i64] {
match flag {
true => yes,
false => no,
};
}
fn main() -> Unit {
meta {
choose(true, expr { 4 }, expr { 9 });
};
0;
}
Inline runtime splice example:
meta fn make_message() -> Expr[String] {
expr { "hello" };
}
fn main() -> String {
$(make_message())
}
Current Limitations
- Parser support comes first; some elaboration/type rules remain partial.
- $expr is parsed in expression positions and must be resolved before runtime lowering.
- Runtime cannot consume type values directly.
- Type-structure matching supports direct type patterns in match arms.
Validation Rules
- Runtime values cannot be used in compile-time-only contexts.
- Compile-time-only values cannot escape into runtime expressions.
- Splice insertion must be type-correct at the quote site.
Phase Boundary Behavior
Meta evaluation is strict: it can only read values introduced in meta
contexts (meta-function parameters, meta let bindings, and meta call
results). Runtime bindings are unavailable in the meta phase.
Valid meta-phase access:
meta fn use_int(k: i64) -> Expr[i64] {
expr { k };
}
fn main() -> Unit {
meta {
use_int(4);
};
0;
}
Invalid runtime-to-meta access:
meta fn use_int(k: i64) -> Expr[i64] {
expr { k };
}
fn main() -> Unit {
let runtime_k = 4;
meta {
use_int(runtime_k); // error: runtime binding unavailable in meta phase
};
0;
}
MIR
This page is the high-level entry point for MIR documentation.
Detailed MIR internals (instruction set, lowering behavior, local slot representation, backend mapping, and naming) live in crate docs so there is one source of truth.
Where to read MIR docs
- Core MIR crate docs source: crates/ruka_mir/src/docs.md
- MIR re-export module in the main crate: src/mir/mod.rs
Published rustdoc links
- ruka_mir crate docs
- rukalang MIR module docs
- MirFunction docs
- MirInstr docs
- MirParamBinding docs
- MirCallArgBinding docs
- MirAggregateArg docs
If you are looking for naming details like p_*, v_*, and t_* locals (for
example v_1), use the crate docs above; that is where the full explanation is
maintained.
Internals
This section documents implementation details that contributors use when working on compiler architecture.
- Compiler Pipeline
- Nanopass Roadmap
- Pass Inventory
- Calling Conventions
- Borrow Checking
- WASM Shadow Stack
- Ownership Representation
Compiler Pipeline
At a high level, the compiler pipeline is:
- Parse source into syntax structures.
- Expand runtime meta { ... } forms into runtime AST.
- Elaborate/concretize runtime types (instantiate generic struct templates into concrete runtime structs).
- Lower into typed HIR.
- Run checker passes.
- Lower into MIR.
- Emit Rust code and optionally execute it.
Checker invariant: runtime generic types are not supported past elaboration. If a generic runtime type survives into checking, it is treated as an invariant violation and rejected.
See README.md, Language Design (MVP), and the Metaprogramming reference.
Nanopass Roadmap
This document proposes a nanopass-inspired compiler architecture for RukaLang. It is a migration plan, not an all-at-once rewrite.
Assumptions
- The main priorities are maintainability and clarity of invariants.
- Small compile-time regressions are acceptable if we gain much better architecture hygiene.
- We keep the current top-level stage order (syntax -> meta -> elab -> hir -> check -> mir -> codegen) and split inside stages first.
- We preserve current language behavior while we refactor pass structure.
If these assumptions change, we should revise this plan before implementation.
Assumptions confirmed by maintainer (2026-03-18).
Design Goals
- Minimize boilerplate when adding passes and intermediate forms, using macros or other metaprogramming constructs where appropriate.
- Carry source information through the whole pipeline with one shared representation.
- Maximize allocation performance with arena-based storage (cranelift_entity first choice).
- Make pass contracts explicit: each pass declares required input invariants and guaranteed output invariants.
- Framework should be split into its own crate to allow potential separate publication & reuse in other compilers later, but supporting RukaLang should be the top priority.
Core Architecture
1) Pass Interface With Low Boilerplate
Introduce a shared pass API with a small descriptor and one run method.
pub trait Pass {
    type In;
    type Out;
    type Error;
    const NAME: &'static str;

    fn run(&mut self, input: Self::In, cx: &mut PassContext) -> Result<Self::Out, Self::Error>;
}
PassContext carries shared facilities:
- interners,
- source/provenance tables,
- diagnostics sink,
- reusable scratch arenas,
- stats/timing hooks.
To reduce repeated code, add a small helper macro for pass declaration metadata and a pass runner that wires logging/timing/diagnostic framing once.
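To make the shape of this API concrete, here is a minimal, self-contained sketch of one pass and a runner that records timing. It assumes a stripped-down PassContext holding only a timings list; the CountLines pass, the run_pass helper, and the demo.count_lines name are hypothetical and exist only for illustration.

```rust
use std::time::{Duration, Instant};

/// Stand-in for the shared context. The real one would also carry interners,
/// provenance tables, a diagnostics sink, and scratch arenas.
#[derive(Default)]
pub struct PassContext {
    pub timings: Vec<(&'static str, Duration)>,
}

pub trait Pass {
    type In;
    type Out;
    type Error;
    const NAME: &'static str;

    fn run(&mut self, input: Self::In, cx: &mut PassContext) -> Result<Self::Out, Self::Error>;
}

/// Runner that wires timing once so individual passes do not repeat it.
pub fn run_pass<P: Pass>(
    pass: &mut P,
    input: P::In,
    cx: &mut PassContext,
) -> Result<P::Out, P::Error> {
    let start = Instant::now();
    let result = pass.run(input, cx);
    // Timing is recorded for both successful and failed runs.
    cx.timings.push((P::NAME, start.elapsed()));
    result
}

/// Toy pass (hypothetical): counts non-blank lines in a source string.
pub struct CountLines;

impl Pass for CountLines {
    type In = String;
    type Out = usize;
    type Error = String;
    const NAME: &'static str = "demo.count_lines";

    fn run(&mut self, input: String, _cx: &mut PassContext) -> Result<usize, String> {
        if input.is_empty() {
            return Err("empty input".to_string());
        }
        Ok(input.lines().filter(|line| !line.trim().is_empty()).count())
    }
}
```

Because run_pass is generic over P, dispatch stays static, which fits a typed pass list without dynamic dispatch.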
2) Shared Source + Provenance Representation
Use one representation for all source and expansion provenance:
- SourceFileId (entity_impl! newtype)
- SpanId pointing into a global span arena
- OriginId for synthetic/generated nodes
Each IR node stores either:
- a direct compact SpanId, or
- an OriginId when generated from other nodes.
OriginId resolves through a provenance graph:
- Origin::Parsed { span }
- Origin::Expanded { from: OriginId, phase: PassId }
- Origin::Lowered { from: OriginId, phase: PassId }
- Origin::Synthesized { reason, parent: Option<OriginId> }
This gives one diagnostics path from any late IR entity back to user source, including meta expansion and lowering.
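Resolving an origin back to a user-facing span amounts to walking this graph until a parsed node is reached. The sketch below assumes the provenance table is a plain slice indexed by OriginId and that reason is a static string; both are simplifications, and resolve_span is a hypothetical helper name.

```rust
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub struct SpanId(pub u32);
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub struct OriginId(pub u32);
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub struct PassId(pub u32);

#[derive(Clone, Debug)]
pub enum Origin {
    Parsed { span: SpanId },
    Expanded { from: OriginId, phase: PassId },
    Lowered { from: OriginId, phase: PassId },
    Synthesized { reason: &'static str, parent: Option<OriginId> },
}

/// Walks the chain from `id` back to a parsed span, if one exists.
pub fn resolve_span(origins: &[Origin], mut id: OriginId) -> Option<SpanId> {
    // Bound the walk by the table size so a malformed cycle cannot loop forever.
    for _ in 0..=origins.len() {
        match origins.get(id.0 as usize)? {
            Origin::Parsed { span } => return Some(*span),
            Origin::Expanded { from, .. } | Origin::Lowered { from, .. } => id = *from,
            // Synthesized nodes without a parent have no user source to point at.
            Origin::Synthesized { parent, .. } => id = (*parent)?,
        }
    }
    None
}
```

A renderer could also collect the intermediate Expanded and Lowered steps during the same walk to produce the secondary note for generated nodes.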
3) Allocation Model
Use PrimaryMap for arena-owned entities and SecondaryMap/side tables for annotations:
- PrimaryMap<EntityId, Node> for nodes,
- SecondaryMap<EntityId, SpanId> for direct source,
- SecondaryMap<EntityId, OriginId> for provenance,
- compact index-backed vectors for analysis facts.
Guidelines:
- pre-size maps from cheap counts when possible,
- avoid cloning large subtrees across passes; prefer id remapping tables,
- keep per-pass temporary state in scratch arenas owned by PassContext, then clear/reuse.
Pipeline Refactor Plan
Current implementation reference:
- See Pass Inventory for the current typed pass list, execution order, and implementation links.
Implementation Status (2026-03-19)
Completed so far:
- Pass framework landed in src/pass with typed pass execution, pass ids, timing capture, and shared provenance tables (SourceFileId, SpanId, OriginId).
- Top-level production pipeline now runs through typed pass wrappers for all current stages: meta, elab, hir, check, mir, codegen.rust, codegen.wasm.
- Elaboration split-in-progress: core runtime call/template concerns are now explicit subpasses:
  - elab.normalize_runtime_calls_and_spreads
  - elab.validate_runtime_call_args
  - elab.bind_template_call_args
  - elab.instantiate_runtime_function
- Per-pass observability landed:
  - pass timings (--dump-pass-timings)
  - pass snapshots (--dump-pass-snapshots)
  - JSONL snapshots with schema/version (--dump-pass-snapshots-json)
- Provenance side-table implementation started:
  - HIR expression origin side tables
  - MIR local origin side tables
  - origin chains include Parsed -> Expanded(meta) -> Lowered(elab) -> Lowered(hir[/mir])
- Browser/WASM analysis path migrated onto driver-based pipeline hooks.
- CI now includes a browser WASM API smoke check to catch runtime regressions in compile/analyze behavior.
Remaining major work:
- Continue decomposition of elab until major mixed-responsibility blocks are isolated behind stable pass contracts.
- Start check phase split (collect_decls, resolve_signatures, etc.).
- Extend provenance mapping to more node/entity kinds and tighten diagnostic source reconstruction quality.
- Add stronger fixture/snapshot coverage for pass contracts and structured snapshot schema stability.
Phase 0: Infrastructure First
Deliverables:
- pass crate/module with Pass, PassContext, PassId, pass runner.
- Shared provenance tables and IDs (SourceFileId, SpanId, OriginId).
- Compiler driver updates to run a pass list and emit pass timing stats.
Exit criteria:
- current behavior unchanged,
- existing tests pass,
- source spans still appear in diagnostics.
Status:
- Complete.
Phase 1: Split elab Into Explicit Subpasses
Current elab mixes many concerns. First split candidate:
- collect_runtime_templates
- resolve_type_names
- instantiate_runtime_templates
- infer_runtime_expr_types
- normalize_runtime_calls_and_spreads
- runtime_type_validation
Each subpass operates over one arena-backed runtime AST form with side tables, not deep cloning.
Exit criteria:
- golden tests for per-pass output snapshots,
- invariants documented for each pass,
- no language behavior drift.
Status:
- In progress. Core runtime call/template concerns now run as explicit elab subpasses (normalize_runtime_calls_and_spreads, validate_runtime_call_args, bind_template_call_args, instantiate_runtime_function).
Phase 2: Split check Into Independent Analyses
Suggested decomposition:
- collect_decls
- resolve_signatures
- check_expr_and_stmt_types
- check_loans_and_moves
- finalize_checked_program
Store analysis outputs in compact side tables keyed by expression/statement ids.
Exit criteria:
- diagnostics parity for current fixtures,
- checker internals no longer require one giant mutable state object.
Phase 3: Split mir_lower
Suggested decomposition:
- build_function_skeletons
- lower_cfg
- lower_types_and_layout
- insert_runtime_intrinsics
- mir_sanity_validation
Exit criteria:
- MIR graph parity on fixture corpus,
- no codegen regressions in Rust/WASM outputs.
Phase 4: Optional Full Nanopass Expansion
After phases 1-3, we can choose finer granularity pass-by-pass.
Decision gate:
- if a pass still has mixed responsibilities or weak invariants, split again,
- if not, keep current granularity.
This keeps a path to full nanopass architecture without forcing every split immediately.
Boilerplate Reduction Strategy
- Use generated EntityId newtypes (entity_impl!) and common arena wrappers.
- Keep one pass registration table:
- pass name,
- input/output type ids,
- optional debug dump hook.
- Auto-wire pass logging, timing, and panic context in one runner.
- Reuse traversal helpers for common AST/HIR/MIR walk patterns.
Source/Diagnostics Strategy
- Every emitted diagnostic must carry an OriginId.
- Diagnostics rendering resolves origin chain to best user-facing span.
- If multiple source candidates exist (for generated nodes), render:
- primary span,
- one secondary note with expansion/lowering origin.
This keeps diagnostics robust as pass count grows.
Performance Strategy
- Prefer arena ids over owned recursive trees in inner passes.
- Keep hot tables in flat vectors keyed by entity index.
- Batch allocate nodes and annotations per pass; avoid per-node heap allocations.
- Collect and track pass timing/allocation counters from day one of migration.
Validation Plan
At each phase:
- Run cargo test.
- Run ./scripts/ci.sh before PR.
- Add fixture tests for any new diagnostics surface.
- Add pass contract tests:
- checks for required input invariants,
- checks for guaranteed output invariants.
Decisions (Confirmed)
- Pass errors use per-pass error enums wrapped by one top-level compiler error type.
- Provenance uses one canonical OriginId path; parsed nodes are Origin::Parsed { span }.
  - Storage policy: keep OriginId in side tables keyed by arena entity ids, not as direct IR node fields.
  - Rationale: lower node-size overhead, less constructor/pattern-match churn, one shared provenance representation.
- Subpasses prefer in-place mutation of arena-backed IR and side tables, and emit a new IR only when structure must change.
- Pass registration starts as a static compile-time pass list (typed, no dynamic dispatch).
- Expose pass-level debug dumps through CLI flags in phase 0.
- Persist provenance graph in browser artifacts and revisit graph presentation as pass count grows.
- Use one shared IR node id namespace per stage (not per-module) for maintainability.
Suggested First Implementation Slice
Keep this first slice small and reversible:
- Add PassContext + provenance ids/tables.
- Wrap existing elab::elaborate_program as a single pass under new runner.
- Split only one elab concern (normalize_runtime_calls_and_spreads) into its own pass.
- Verify diagnostics parity and benchmark compile time on fixture corpus.
If this slice lands cleanly, continue with the rest of phase 1.
Pass Inventory
Current passes that run through the typed pass mechanism (Pass + PassContext).
Rustdoc links use mdBook-relative paths (../../rustdoc/...).
Top-Level Pipeline (execution order)
- meta.expand_program - Expand meta constructs into runtime-facing AST.
- elab.elaborate_program - Elaborate types/templates in AST.
- hir.lower_program - Lower elaborated AST to HIR.
- check.check_program - Semantic/type check HIR.
- mir.lower_program - Lower HIR + checker facts to MIR.
- codegen.rust.emit_program - Emit Rust source from MIR.
- codegen.wasm.emit_program - Emit WAT/WASM artifacts from MIR.
Elaboration Subpasses
- elab.normalize_runtime_calls_and_spreads - Rewrite ...tuple call args.
- elab.validate_runtime_call_args - Validate normalized runtime call args.
- elab.bind_template_call_args - Bind template call args and build specialization key.
- elab.instantiate_runtime_function - Instantiate/cache concrete runtime template function.
Calling Conventions
RukaLang has two internal call boundaries that matter for compiler work:
- MIR-level ownership and representation conventions.
- Backend ABI conventions (Rust emission and direct WASM emission).
This page documents the current rules and points to the rustdoc pages where those rules are encoded.
MIR-Level Contract
MIR stores both ownership mode and runtime/storage representation. Together, these define how a value is passed at call boundaries.
- Function parameter ownership is represented by MirOwnershipMode.
- Call-site argument ownership is represented by MirArgMode.
- Parameter boundary metadata is normalized through MirParamBinding.
- Call argument metadata is normalized through MirCallArgBinding.
- Local runtime/storage shape is represented by MirLocalRepr.
Current boundary semantics:
- View parameters are source-level read-only access; MutBorrow parameters are mutable borrow access; Owned parameters are value transfer parameters.
- MirParamBinding::source_repr and MirParamBinding::local_repr define source boundary shape vs lowered local shape.
- MirParamBinding::requires_materialization marks parameters where the lowered local representation differs from the source boundary representation.
- MirParamBinding::materializes_view_from_owned marks the current materialized view case where a source owned value is projected into a view-local boundary.
- MirCallArgBinding::requires_deref_read marks call arguments that need a load/read from a place local before value passing.
- Boundary coercions that require runtime checks currently lower at MIR call sites with explicit CollectionLen comparisons and CallExtern("std::panic") on failure.
Core MIR container docs:
Rust Backend Convention
Rust codegen follows Rust-level references/values directly:
- Internal calls (MirInstr::Call) lower with emit_internal_call_args.
- Runtime/extern calls (MirInstr::CallExtern) lower with emit_extern_call_args.
Behavior by argument mode:
- Borrowed (view call arg mode) passes &T (or &*place when the local is place-shaped).
- MutableBorrow passes &mut T (or &mut *place for mutable place locals).
- OwnedMove passes by move; place reads are cloned from dereference.
- OwnedCopy passes a cloned value; place reads are cloned from dereference.
For slice place reads copied into owned values, Rust emission uses .to_vec()
instead of (*place).clone().
Entry points:
WASM Backend Convention
The direct WASM backend uses a strict, explicit ABI with normalized value types and out-slot returns for aggregate return values.
Signature shaping is defined by:
Current value mapping:
- i64 lowers to i64.
- Most other runtime values lower to pointer-sized i32 handles (including strings, pointers, arrays, tuples, structs, enums, slices, and references).
Return conventions:
- Non-aggregate mutable-borrow params use an inout convention: argument value in, updated value returned as an extra WASM result.
- Scalar-like returns use normal WASM result values.
- Aggregate returns (currently tuple/struct/slice) use an out-slot pointer parameter inserted at parameter index 0 and no WASM result.
- Aggregate temporaries are placed on the runtime shadow stack when required.
For the full shadow-stack lifecycle and memory layout, see WASM Shadow Stack.
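The value mapping and the out-slot rule can be sketched as a small signature-shaping function. This is a simplified model, not the backend's signature_types: the Ty and ValType enums and shape_signature are hypothetical, a unit return is modeled as None, and the inout convention for mutable-borrow parameters is intentionally not modeled.

```rust
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub enum ValType {
    I32,
    I64,
}

/// Simplified source-level types; names are illustrative, not the compiler's.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub enum Ty {
    I64,
    String,
    Pointer,
    Array,
    Tuple,
    Struct,
    Slice,
}

/// i64 stays i64; the other modeled types become i32 handles.
fn lower_value(ty: Ty) -> ValType {
    match ty {
        Ty::I64 => ValType::I64,
        _ => ValType::I32,
    }
}

/// Aggregate returns (tuple/struct/slice) go through an out-slot pointer.
fn returns_via_out_slot(ty: Ty) -> bool {
    matches!(ty, Ty::Tuple | Ty::Struct | Ty::Slice)
}

/// Returns (wasm params, wasm results) for a function signature.
pub fn shape_signature(params: &[Ty], ret: Option<Ty>) -> (Vec<ValType>, Vec<ValType>) {
    let mut wasm_params: Vec<ValType> = params.iter().map(|&ty| lower_value(ty)).collect();
    let mut results = Vec::new();
    match ret {
        // Out-slot pointer is inserted at parameter index 0; no WASM result.
        Some(ty) if returns_via_out_slot(ty) => wasm_params.insert(0, ValType::I32),
        // Scalar-like returns use a normal WASM result value.
        Some(ty) => results.push(lower_value(ty)),
        None => {}
    }
    (wasm_params, results)
}
```

For example, under this model a function taking one i64 and returning a struct ends up with params [I32, I64] and an empty result list.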
Call lowering is defined by:
The call-argument strategy is selected from local representation and arg mode:
- Pass-through by value for normal value locals and mutable-borrow pointer ABI args.
- Dereference-load for mutable-borrow inout args when the source local is a non-passthrough place.
- Dereference-load to match callee ABI when reading from place-shaped locals.
Backend entry points:
Runtime WASM ABI Surface
Runtime-call ABI metadata is centralized in ruka_runtime:
The coercion runtime trap path is exposed as std::panic in this descriptor
table.
This descriptor table is what the WASM backend linker and call lowering use for symbol resolution and runtime signature checks.
Borrow Checking
RukaLang now tracks local borrows with a place-based checker that is smaller than Rust borrowck but follows the same safety shape for overlapping access.
Scope and Goals
- Keep reference semantics simple for MVP.
- Support temporary local references to named bindings, struct fields, tuple fields, indexed elements, and slice ranges.
- Prevent overlapping mutable and shared access to the same storage region.
- Avoid lifetime inference complexity by keeping references local-only (no storing in user data structures and no returning references).
Surface Forms Covered
- let x = place_expr creates a shared local reference when place_expr is a place expression.
- let &x = place_expr creates a mutable local reference.
- Existing call-argument borrow forms remain available (&arg for &T parameters).
For non-place initializers, plain let x = expr continues to behave like a
normal value initialization.
Place Model
The checker resolves borrowable expressions into a canonical place path:
- root binding name (x)
- zero or more projections:
  - field projection (.field / tuple index like .0)
  - index-like projection ([i] and [a..b] both normalize to one index-like projection)
Examples:
- pair.left -> pair .field(left)
- xs[3] -> xs .index_like
- xs[1..3] -> xs .index_like
- pair.left.value -> pair .field(left) .field(value)
Active Loans
Each scope keeps a list of active loans:
- loan kind: shared or mutable
- place path
- owner local (the local binding that introduced the loan)
Loans are introduced by local borrow declarations and removed when the owning scope exits.
Overlap Rule
Two places overlap when:
- they have the same root binding, and
- their projections are not proven disjoint.
Disjointness rule used today:
- field-vs-field at the same depth with different field names is disjoint (pair.left vs pair.right)
- any case involving index-like projection is treated as overlapping (conservative)
- prefix/ancestor-descendant place relations overlap
This matches Rust's conservative behavior for array/slice indexing while still allowing independent struct-field borrows.
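One way to picture the overlap rule is a pairwise walk over two canonical place paths. The sketch below is an assumption-laden illustration rather than the checker's code: Place, Projection, field, and overlaps are hypothetical names, and tuple indices are treated as ordinary field names.

```rust
#[derive(Clone, Debug, PartialEq, Eq)]
pub enum Projection {
    /// `.field` or a tuple index such as `.0`, stored by name.
    Field(String),
    /// `[i]` and `[a..b]` both normalize to this.
    IndexLike,
}

#[derive(Clone, Debug, PartialEq, Eq)]
pub struct Place {
    pub root: String,
    pub projections: Vec<Projection>,
}

impl Place {
    pub fn new(root: &str, projections: Vec<Projection>) -> Self {
        Place { root: root.to_string(), projections }
    }
}

/// Convenience constructor for a field projection.
pub fn field(name: &str) -> Projection {
    Projection::Field(name.to_string())
}

/// Two places overlap unless their roots differ or their projections
/// are proven disjoint by differing field names at the same depth.
pub fn overlaps(a: &Place, b: &Place) -> bool {
    if a.root != b.root {
        return false;
    }
    for (pa, pb) in a.projections.iter().zip(&b.projections) {
        match (pa, pb) {
            // Different fields at the same depth: proven disjoint.
            (Projection::Field(x), Projection::Field(y)) if x != y => return false,
            // Same field: keep comparing deeper projections.
            (Projection::Field(_), Projection::Field(_)) => continue,
            // Anything involving an index-like projection: conservative overlap.
            _ => return true,
        }
    }
    // One path is a prefix of the other (or they are equal): overlap.
    true
}
```

With this model, pair.left and pair.right are disjoint, while xs[0] and xs[1] overlap because both reduce to xs .index_like.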
Enforced Access Rules
- read of a place is rejected if an overlapping mutable loan is active
- write/move of a place is rejected if any overlapping loan is active
- creating a shared loan is rejected if an overlapping mutable loan is active
- creating a mutable loan is rejected if any overlapping loan is active
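These four rules reduce to two cases once the overlapping loans for a place are known. The following sketch assumes the caller has already filtered the active loans down to those that overlap the accessed place; Access, LoanKind, and is_rejected are hypothetical names.

```rust
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub enum LoanKind {
    Shared,
    Mutable,
}

#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub enum Access {
    Read,
    WriteOrMove,
    NewSharedLoan,
    NewMutableLoan,
}

/// `overlapping_loans` holds the kinds of active loans that overlap the
/// accessed place. Returns true when the access must be rejected.
pub fn is_rejected(access: Access, overlapping_loans: &[LoanKind]) -> bool {
    match access {
        // Reads and new shared loans only conflict with mutable loans.
        Access::Read | Access::NewSharedLoan => overlapping_loans.contains(&LoanKind::Mutable),
        // Writes, moves, and new mutable loans conflict with any loan.
        Access::WriteOrMove | Access::NewMutableLoan => !overlapping_loans.is_empty(),
    }
}
```

Grouping the accesses this way makes the symmetry visible: shared-style accesses tolerate other shared loans, exclusive-style accesses tolerate none.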
Current Limits
- No index disjointness proof (xs[0] vs xs[1] is still overlapping).
- No borrow splitting API yet (Rust-like split_at_mut equivalent not present).
- Checker is lexical-scope based; it does not perform advanced non-lexical lifetime shortening.
These limits are intentional for MVP simplicity.
WASM Shadow Stack
This page explains exactly when the direct WASM backend uses the shadow stack, how frame layout is computed, and how out-slot returns interact with it.
What It Is
RukaLang's direct WASM backend uses a per-call shadow stack frame for aggregate values that should not be heap-allocated for temporary use.
The compile-time decisions are encoded in:
The runtime ABI symbols used to reserve/release the frame are:
- __ruka_rt::shadow_stack_reserve
- __ruka_rt::shadow_stack_release
Exactly When It Is Used
A local is assigned shadow-stack storage when all of the following are true:
- The local is not a function parameter.
- Either:
  - the local is a value local whose type is one of Tuple, Struct, or Slice, or
  - the local is a slice place local (RefRo<Slice<_>> or RefMut<Slice<_>>).
This selection logic is implemented by should_shadow_stack_local and is_shadow_stack_aggregate_ty.
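The selection conditions can be read as a single boolean predicate. The sketch below uses a deliberately tiny, hypothetical Ty enum and hypothetical function names (wants_shadow_slot and its helpers); it mirrors the conditions listed above and is not the backend's actual implementation.

```rust
/// Minimal stand-in for the compiler's type representation (assumed shape).
#[derive(Debug, Clone, PartialEq, Eq)]
pub enum Ty {
    I32,
    Tuple,
    Struct,
    Slice,
    RefRo(Box<Ty>),
    RefMut(Box<Ty>),
}

/// Value types that get shadow-stack storage.
pub fn is_aggregate(ty: &Ty) -> bool {
    matches!(ty, Ty::Tuple | Ty::Struct | Ty::Slice)
}

/// `RefRo<Slice<_>>` or `RefMut<Slice<_>>`.
pub fn is_slice_place(ty: &Ty) -> bool {
    match ty {
        Ty::RefRo(inner) | Ty::RefMut(inner) => matches!(**inner, Ty::Slice),
        _ => false,
    }
}

/// A local is selected when it is not a parameter and is either an
/// aggregate value local or a slice place local.
pub fn wants_shadow_slot(is_param: bool, ty: &Ty) -> bool {
    !is_param && (is_aggregate(ty) || is_slice_place(ty))
}
```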
Frame Layout and Prologue
Frame construction happens in lower_function.
For each selected local:
- Payload bytes are computed via aggregate_payload_bytes.
- Slot size is align_up(ARRAY_DATA_OFFSET + payload_bytes, 8).
- Offsets are assigned in declaration order, each starting at 8-byte alignment.
After all slots are sized:
- If frame size is zero, no runtime shadow-stack calls are emitted.
- If frame size is non-zero, function entry emits one reserve call to __ruka_rt::shadow_stack_reserve(frame_bytes).
- The returned base pointer is kept in a scratch local.
- Each shadow local is initialized to frame_base + local_offset.
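The slot sizing and offset assignment can be sketched as follows. The value of ARRAY_DATA_OFFSET is not stated on this page, so the 8 used below is an assumption for illustration only; FrameLayout and layout_frame are hypothetical names, and payload sizes are passed in directly rather than computed from types.

```rust
/// Assumed header size stored before each payload. The real constant's
/// value is not given here; 8 is chosen only for this example.
pub const ARRAY_DATA_OFFSET: u32 = 8;

/// Round `value` up to the next multiple of `align` (a power of two).
pub fn align_up(value: u32, align: u32) -> u32 {
    debug_assert!(align.is_power_of_two());
    (value + align - 1) & !(align - 1)
}

/// Result of laying out all shadow-stack locals of one function.
#[derive(Debug, PartialEq, Eq)]
pub struct FrameLayout {
    /// Byte offset of each selected local, in declaration order.
    pub offsets: Vec<u32>,
    /// Total bytes to reserve at function entry (zero means no reserve call).
    pub frame_bytes: u32,
}

/// Assign one 8-byte-aligned slot per selected local.
pub fn layout_frame(payload_bytes: &[u32]) -> FrameLayout {
    let mut offsets = Vec::with_capacity(payload_bytes.len());
    let mut cursor = 0u32;
    for &payload in payload_bytes {
        // Every slot size is a multiple of 8, so each offset stays aligned.
        offsets.push(cursor);
        cursor += align_up(ARRAY_DATA_OFFSET + payload, 8);
    }
    FrameLayout { offsets, frame_bytes: cursor }
}
```

Under the assumed header size, two locals with 4-byte and 12-byte payloads get slots of 16 and 24 bytes, so their offsets are 0 and 16 and the frame is 40 bytes. Each local's pointer is then frame_base plus its offset.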
How Instructions Use Shadow-Stack Locals
Aggregate-producing instructions check whether dst is shadow-backed:
- lower_aggregate_instr skips heap allocation for tuple/struct/slice destinations and requires those destinations to be shadow-backed.
- The instruction then writes aggregate fields directly through the local pointer.
For call destinations:
- lower_call_family_instr checks whether the destination local is shadow-backed and requires out-slot destinations to be shadow-backed.
Out-Slot Returns and Caller/Callee Behavior
Return-type decision:
- function_returns_via_out_slot currently returns true for tuple/struct/slice return types.
- Borrowed/reference returns are rejected before signature planning.
Signature shaping:
- signature_types inserts an i32 out-slot pointer parameter at index 0 when return-via-out-slot is required.
- In that case, the WASM result list is empty.
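The signature shaping described above can be sketched with a small model of WASM value types. ValType, ReturnKind, and shape_signature are hypothetical names used only for this illustration; the real signature_types function works on the compiler's own types.

```rust
/// Subset of WASM value types, enough for the example.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum ValType {
    I32,
    I64,
    F64,
}

/// Coarse classification of a source-level return type (assumed shape).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum ReturnKind {
    Unit,
    Scalar(ValType),
    /// Tuple, struct, or slice: returned through an out-slot.
    Aggregate,
}

/// Aggregates are written through a caller-provided pointer.
pub fn returns_via_out_slot(ret: ReturnKind) -> bool {
    matches!(ret, ReturnKind::Aggregate)
}

/// Returns (WASM params, WASM results) for a function.
pub fn shape_signature(params: &[ValType], ret: ReturnKind) -> (Vec<ValType>, Vec<ValType>) {
    let mut wasm_params = Vec::with_capacity(params.len() + 1);
    if returns_via_out_slot(ret) {
        // The out-slot pointer always comes first, at index 0.
        wasm_params.push(ValType::I32);
    }
    wasm_params.extend_from_slice(params);
    let results = match ret {
        ReturnKind::Scalar(ty) => vec![ty],
        // Out-slot and unit returns produce no WASM result values.
        ReturnKind::Unit | ReturnKind::Aggregate => Vec::new(),
    };
    (wasm_params, results)
}
```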
Call-site behavior:
- Caller passes destination pointer as arg 0.
- If destination local is shadow-backed, that pointer is reused.
- If destination local is not shadow-backed, lowering fails instead of heap-allocating implicit out-slot storage.
Return behavior:
- lower_terminator handles Return.
- For out-slot returns, it copies return bytes from the local storage pointer to the out-slot pointer.
- For non-out-slot returns, it pushes the value as a normal WASM result.
Release and Lifetime Rules
At every emitted Return path in lower_terminator:
- If the function reserved a non-zero frame, the backend emits one call to __ruka_rt::shadow_stack_release(frame_bytes) before return.
This gives function-scoped shadow-stack lifetimes:
- Reserve once in function entry.
- Reuse slots for all selected locals in that function.
- Release once on each return path.
Runtime Side Notes
The runtime reserve/release behavior itself is implemented in the wasm32-only runtime module source:
That module currently:
- lazily allocates one backing region,
- bumps a shadow-stack pointer on reserve,
- checks overflow/underflow with assertions,
- and rewinds the pointer on release.
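The reserve/release behavior listed above amounts to a bump allocator with stack discipline. The sketch below is a host-side model, not the wasm32 runtime module: ShadowStack and its methods are hypothetical names, the stack is assumed to grow upward, and reserve returns an offset into the region rather than a raw pointer.

```rust
/// Host-side model of a bump-style shadow stack.
pub struct ShadowStack {
    /// Backing region, allocated on first reserve.
    region: Option<Vec<u8>>,
    /// Current top of the stack as an offset into the region.
    top: usize,
    capacity: usize,
}

impl ShadowStack {
    pub fn new(capacity: usize) -> Self {
        ShadowStack { region: None, top: 0, capacity }
    }

    /// Reserve `bytes` and return the base offset of the new frame.
    pub fn reserve(&mut self, bytes: usize) -> usize {
        // Lazily allocate the single backing region.
        if self.region.is_none() {
            self.region = Some(vec![0u8; self.capacity]);
        }
        assert!(bytes <= self.capacity - self.top, "shadow stack overflow");
        let base = self.top;
        self.top += bytes;
        base
    }

    /// Rewind the stack pointer by `bytes`.
    pub fn release(&mut self, bytes: usize) {
        assert!(bytes <= self.top, "shadow stack underflow");
        self.top -= bytes;
    }

    pub fn is_allocated(&self) -> bool {
        self.region.is_some()
    }

    pub fn top(&self) -> usize {
        self.top
    }
}
```

Because each function reserves once at entry and releases the same byte count on every return path, nested calls unwind in strict last-in, first-out order and the pointer returns to its starting value.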
Ownership Representation
This page describes the ownership representation used by checker, MIR lowering, and both backends today.
The canonical model separates:
- type identity (BaseTy)
- access intent at a boundary (AccessMode)
- runtime/local storage representation (MirLocalRepr, MirHeapOwnership)
That split keeps compatibility decisions in one place while preserving backend-specific lowering details where they belong.
Canonical Ownership Model
Shared ownership modeling lives in ruka_types:
- AccessMode: View | MutBorrow | Owned
- BaseTy: pure type identity (StaticArray, DynamicArray, Slice, StaticSlice)
- NormalizedTy: (BaseTy, AccessMode)
- normalize_ty: converts Ty into NormalizedTy
- normalize_ty_with_access: converts one Ty into a boundary-aware normalized form with explicit access mode
Ty remains the semantic type carried across the compiler, but compatibility and
boundary logic are expressed in terms of normalized ownership data.
Compatibility Decisions
Compatibility/coercion decisions are centralized in ruka_types:
- ty_coercion_decision: type-to-type compatibility
- boundary_coercion_decision: explicit expected/actual boundary access mode compatibility
- CoercionDecision: Denied | Allowed { exact, check, materialization }
Policy fields are independent so decisions can express combinations:
- check (CheckPolicy) answers whether runtime validation is required
- materialization (MaterializationPolicy) answers whether representation bridging is required
Current runtime-check categories:
Current concrete materialization categories:
- MaterializationKind::ViewFromOwned
- MaterializationKind::DynamicArrayFromStaticArray
- MaterializationKind::StaticArrayFromDynamicArray
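To make the shape of a decision concrete, the sketch below models CoercionDecision with independent check and materialization fields. Everything beyond the names AccessMode, CoercionDecision, and ViewFromOwned is an assumption: the variants of CheckPolicy and MaterializationPolicy are invented for the example, and the allow/deny table in boundary_decision_sketch is a guess chosen only to show how the fields combine, not the policy implemented in ruka_types.

```rust
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum AccessMode {
    View,
    MutBorrow,
    Owned,
}

/// Hypothetical variants: whether runtime validation is required.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum CheckPolicy {
    NotRequired,
    Runtime,
}

/// Hypothetical variants: whether representation bridging is required.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum MaterializationPolicy {
    NotRequired,
    ViewFromOwned,
}

#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum CoercionDecision {
    Denied,
    Allowed {
        exact: bool,
        check: CheckPolicy,
        materialization: MaterializationPolicy,
    },
}

/// Assumed policy for illustration: identical modes match exactly, an owned
/// value may be passed where a view is expected, everything else is denied.
pub fn boundary_decision_sketch(expected: AccessMode, actual: AccessMode) -> CoercionDecision {
    match (expected, actual) {
        (e, a) if e == a => CoercionDecision::Allowed {
            exact: true,
            check: CheckPolicy::NotRequired,
            materialization: MaterializationPolicy::NotRequired,
        },
        (AccessMode::View, AccessMode::Owned) => CoercionDecision::Allowed {
            exact: false,
            check: CheckPolicy::NotRequired,
            materialization: MaterializationPolicy::ViewFromOwned,
        },
        _ => CoercionDecision::Denied,
    }
}
```

Keeping check and materialization as separate fields lets one decision say, for example, "allowed, no runtime check, but a view must be materialized" without a combinatorial set of variants.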
Checker Usage
Checker call compatibility uses shared boundary coercion decisions. Ownership
mode is mapped to normalized access mode at call boundaries, then validated by
boundary_coercion_decision.
Primary implementation entry points live in:
crates/ruka_check/src/checker_calls.rs
MIR Boundary Usage
MIR lowering uses one boundary plan path for call arguments and normalized projection for parameter locals.
- Parameter local projection and ownership-mode mapping: crates/ruka_mir_lower/src/lowerer/helpers.rs
- Call argument planning and compatibility usage: crates/ruka_mir_lower/src/lowerer/call_args.rs
MIR itself exposes boundary helpers so consumers do not duplicate branching:
- MirParamBinding
  - expects_view, expects_mut_borrow, materializes_view_from_owned, requires_materialization
- MirCallArgBinding
  - is_borrowed, is_mutable_borrow, is_owned_move, is_owned_copy, requires_deref_read
- MirAggregateArg
  - is_owned_move, is_owned_copy
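The point of these helpers is that a consumer asks a predicate instead of matching on binding variants itself. The sketch below illustrates that style with a hypothetical CallArgBindingSketch enum; its variants are assumed for the example and do not reflect the real definition of MirCallArgBinding, although the predicate names follow the list above.

```rust
/// Hypothetical stand-in for a MIR call-argument binding (variants assumed).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum CallArgBindingSketch {
    BorrowedShared,
    BorrowedMutable,
    OwnedMove,
    OwnedCopy,
}

impl CallArgBindingSketch {
    pub fn is_borrowed(self) -> bool {
        matches!(self, Self::BorrowedShared | Self::BorrowedMutable)
    }

    pub fn is_mutable_borrow(self) -> bool {
        matches!(self, Self::BorrowedMutable)
    }

    pub fn is_owned_move(self) -> bool {
        matches!(self, Self::OwnedMove)
    }

    pub fn is_owned_copy(self) -> bool {
        matches!(self, Self::OwnedCopy)
    }
}

/// A consumer that relies only on the predicates, so adding a new
/// binding variant later does not require touching this function
/// as long as the predicates are kept up to date.
pub fn describe_arg(arg: CallArgBindingSketch) -> &'static str {
    if arg.is_mutable_borrow() {
        "mutable borrow"
    } else if arg.is_borrowed() {
        "shared borrow"
    } else if arg.is_owned_move() {
        "owned move"
    } else {
        "owned copy"
    }
}
```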
Backend ABI Usage
Rust and WASM backends both consume MIR binding helpers rather than re-deriving ownership semantics independently.
Rust:
WASM:
Notes for Contributors
- Add ownership compatibility behavior in ruka_types coercion APIs first.
- Prefer MIR binding helper predicates over open-coded mode matching.
- Keep mdBook pages as overview and workflow guidance; put API detail in rustdoc.