Liria

Genesis · v0.1.0 · OPEN source

Proof-carrying agents.

Liria is a typed intent compiler and runtime for agents that carry their own proof: every effect is justified against a declared intent, every dispatch passes a deny-by-default membrane, and every governance decision replays byte-identically for a third party.

The authority relation of tool-calling is inverted here. The model’s output never reaches the world directly — it is parsed into typed effect proposals that either pass the membrane or do not exist. The model proposes; the membrane disposes; the receipt outlives both.

$ liria-cli replay verify \
    --original  turn_019e5e80a1f3 \
    --replay    turn_019e5e80a1f3-replay

  Replay completed deterministically
  Replay verified (hash: blake3:7d2b…e904)
  Replay diff: MATCH
·

Testimony is not proof.

After an incident, every serious deployment of autonomous agents faces the same four questions. Logs answer none of them — a log is testimony, emitted voluntarily, by the component under suspicion, in a format it controls, after the fact. Richer tracing makes the testimony queryable. Nothing makes it binding.

Q1 Why did the agent perform this specific side effect?
Q2 Which authority — whose policy, which version, what budget — allowed it?
Q3 Can the incident be replayed exactly, by someone who does not trust the operator?
Q4 Can the capability be revoked fleet-wide, with proof of enforcement?

Liria’s position is structural: the answer must live at the boundary where effects are dispatched, not in a layer that describes them afterward. A receipt is worth the boundary that emitted it.

·

A compiler boundary, not a new runtime.

Liria sits at one boundary: intent → typed IR → primitives. The intent is the goal; the IR is a small, auditable set of artifacts (intent, plan, receipts); the primitives are what actually runs. Nothing has an implicit side effect, and everything is policy-addressable.

Liria: author intent, compile to canonical IR + WASM, replay locally
Dragons: runs the IR, enforces policy and leases, captures receipts, produces evidence packs

The relationship mirrors a framework and its host platform: the framework compiles to exactly the runtime’s primitives. In the words of the project: “Dragons is the platform for governed autonomous processes; Liria is the framework that targets it.”

·

The proof spans before, during, after — and across time.

Each phase is a mechanism in the runtime, not a marketing claim. The names below are the real things you invoke and inspect.

01 Before

Admissibility by construction

An agent declares a typed intent — outcome, constraints, budgets — and a plan whose every effect carries a machine-checkable justification pointing at the intent clauses that require it. The complete possible effect surface is enumerable from the artifact alone, before it runs.

justified effects
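
A minimal sketch of what "justified effects" could look like as data, written in Python for brevity. The `Intent` and `Effect` shapes, the clause ids, and the effect family names are hypothetical illustrations, not Liria's IDL. The sketch shows only the idea: each effect cites the intent clauses that require it, and the effect surface can be listed without running anything.

```python
from __future__ import annotations
from dataclasses import dataclass

# Hypothetical shapes for illustration only; not Liria's actual IDL.
@dataclass(frozen=True)
class Intent:
    outcome: str
    clauses: frozenset[str]          # ids of the declared intent clauses

@dataclass(frozen=True)
class Effect:
    family: str                      # e.g. "fs.read" (made-up family name)
    target: str
    justified_by: tuple[str, ...]    # clause ids that require this effect

def inadmissible(intent: Intent, plan: list[Effect]) -> list[Effect]:
    """Effects whose justification is empty or cites an undeclared clause."""
    return [
        e for e in plan
        if not e.justified_by or not set(e.justified_by) <= intent.clauses
    ]

def effect_surface(plan: list[Effect]) -> set[tuple[str, str]]:
    """Every (family, target) the plan could touch, read from the artifact."""
    return {(e.family, e.target) for e in plan}

intent = Intent("summarize the changelog", frozenset({"c1", "c2"}))
plan = [
    Effect("fs.read", "CHANGELOG.md", ("c1",)),
    Effect("llm.complete", "summarizer", ("c1", "c2")),
    Effect("http.post", "example.invalid", ()),   # carries no justification
]
rejected = inadmissible(intent, plan)             # only the http.post effect
```

In this sketch a plan is admissible only when `inadmissible` returns an empty list, so the unjustified `http.post` would stop the plan before it runs.
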
02 During

The membrane

Every effect dispatch resolves capability, deny-by-default policy, and budget — and emits a receipt for the decision, including denials. Compiled agents run as wasmtime instances with instruction-level fuel that traps on exhaustion: the budget is a wall, not a log line.

deny-by-default
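
The decision order described above can be sketched as follows. This is an illustrative Python model under stated assumptions: the `Membrane` class, its field names, and the verdict strings are hypothetical, and it models only the decision and its receipt, not wasmtime fuel or the trap on exhaustion.

```python
from __future__ import annotations
from dataclasses import dataclass, field

# Hypothetical model of a deny-by-default dispatch check; not Liria's API.
@dataclass
class Membrane:
    capabilities: set[str]                 # effect families the agent holds
    allow: set[tuple[str, str]]            # explicit (family, target) rules
    budget: int                            # remaining budget units
    receipts: list[dict] = field(default_factory=list)

    def dispatch(self, family: str, target: str, cost: int) -> bool:
        # Anything not explicitly allowed falls through to a denial.
        if family not in self.capabilities:
            verdict = "deny:capability"
        elif (family, target) not in self.allow:
            verdict = "deny:policy"
        elif cost > self.budget:
            verdict = "deny:budget"
        else:
            verdict = "allow"
            self.budget -= cost
        # A receipt is emitted for every decision, denials included.
        self.receipts.append(
            {"family": family, "target": target, "cost": cost, "verdict": verdict}
        )
        return verdict == "allow"

m = Membrane({"fs.read"}, {("fs.read", "CHANGELOG.md")}, budget=5)
results = [
    m.dispatch("fs.read", "CHANGELOG.md", 3),      # allowed, budget drops to 2
    m.dispatch("fs.read", "/etc/passwd", 1),       # no allow rule
    m.dispatch("http.post", "example.invalid", 1), # no capability
    m.dispatch("fs.read", "CHANGELOG.md", 3),      # exceeds remaining budget
]
```

Four dispatches produce four receipts, three of them denials, which is the property the paragraph above describes.
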
03 After

Deterministic replay

Every turn writes a content-addressed record — canonical JSON, BLAKE3, chained to its predecessor. Model completions are recorded as inputs, not excuses. replay verify proves a second run reproduces the chain byte-identically, offline, with no key and no trust in the operator.

BLAKE3
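
A rough sketch of a chained, content-addressed turn record and a replay comparison. Assumptions are loud here: Python's standard library has no BLAKE3, so `hashlib.blake2b` stands in for it; sorted-key JSON stands in for whatever canonical JSON form Liria uses; and the record fields and function names are made up for illustration.

```python
import hashlib
import json

def canonical(obj) -> bytes:
    # Stand-in canonical form: sorted keys, no insignificant whitespace.
    return json.dumps(
        obj, sort_keys=True, separators=(",", ":"), ensure_ascii=False
    ).encode("utf-8")

def record_hash(prev: str, body: dict) -> str:
    # BLAKE2b is a stand-in for BLAKE3, which is not in the standard library.
    payload = canonical({"prev": prev, "body": body})
    return hashlib.blake2b(payload, digest_size=32).hexdigest()

def chain(turns: list) -> list:
    """Hash each turn together with its predecessor's hash."""
    hashes, prev = [], ""
    for body in turns:
        prev = record_hash(prev, body)
        hashes.append(prev)
    return hashes

def replay_matches(original: list, replay: list) -> bool:
    return chain(original) == chain(replay)

# The model completion is stored as an input of the turn (hypothetical fields).
original = [
    {"completion": "read CHANGELOG.md", "effects": ["fs.read"]},
    {"completion": "done", "effects": []},
]
replay = [
    {"effects": ["fs.read"], "completion": "read CHANGELOG.md"},  # key order differs
    {"effects": [], "completion": "done"},
]
tampered = [
    {"completion": "read CHANGELOG.md", "effects": ["fs.read", "http.post"]},
    {"completion": "done", "effects": []},
]
```

Because each hash covers its predecessor, a change in the first turn of `tampered` changes every later hash as well, while `replay` matches despite its different key order.
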
04 Across time

Governed evolution

A new agent version is a typed proposal: activated only at a turn boundary, promoted to known-good only after probation, rolled back only toward narrower authority. trust revoke forces a score to 0.00 immediately, and the daemon honors it without a restart.

trust revoke
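
Two of the rules above, sketched in Python under stated assumptions. `Version`, `TrustRegistry`, the 0.5 threshold, and the reading of "narrower" as "a subset of the current authority, equality included" are all hypothetical choices made for this illustration; the source states only that rollback moves toward narrower authority and that a revoke forces the score to 0.00 immediately.

```python
from dataclasses import dataclass

# Hypothetical types for illustration; not Liria's trust or versioning API.
@dataclass(frozen=True)
class Version:
    name: str
    authority: frozenset      # effect families this version may use

def may_roll_back(current: Version, target: Version) -> bool:
    """Accept a rollback only if it does not widen authority (assumed: subset)."""
    return target.authority <= current.authority

class TrustRegistry:
    def __init__(self) -> None:
        self._scores = {}     # agent id -> score in [0.0, 1.0]

    def set_score(self, agent: str, score: float) -> None:
        self._scores[agent] = min(1.0, max(0.0, score))

    def revoke(self, agent: str) -> None:
        # Revocation forces the score to 0.00 at once.
        self._scores[agent] = 0.0

    def score(self, agent: str) -> float:
        return self._scores.get(agent, 0.0)

    def permits(self, agent: str, threshold: float = 0.5) -> bool:
        # Read on every check, so a revoke takes effect without a restart.
        return self.score(agent) >= threshold

v1 = Version("v1", frozenset({"fs.read"}))
v2 = Version("v2", frozenset({"fs.read", "http.post"}))

registry = TrustRegistry()
registry.set_score("agent-a", 0.9)
before = registry.permits("agent-a")
registry.revoke("agent-a")
after = registry.permits("agent-a")
```

Rolling back from `v2` to `v1` is accepted because it drops `http.post`; the reverse would widen authority and is refused.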

What replay verifies is deliberately scoped: given these model outputs, these and only these effects were authorized and executed, under this policy version, within these budgets. You cannot prove the model. You can prove the cage — and the cage is what a regulator, a board, or a counterparty actually audits.

·

Proof value = assurance × coverage.

A maximally strong boundary that real work never passes through proves nothing. A certificate layer that spans every runtime but enforces in none proves little more. What matters is the product of the two, summed over the work actually performed — so every receipt is stamped with the enforcement class of the boundary that produced it, and no receipt may claim a class stronger than that boundary.

Class               | Boundary                       | What a receipt of this class proves
M0 observe          | evidence imported, no boundary | the runtime reported X, nothing more
M1 proxy            | tool-call interception         | each intercepted call was policy-checked before it ran
M1+ sandboxed proxy | interception + OS sandbox      | M1, with the side channels around the proxy closed
M2 sandbox          | instruction-level (WASM)       | the complete effect surface; exceeding budget is a trap
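
The stamping rule and the product above can be sketched in a few lines of Python. The total order M0 < M1 < M1+ < M2, the function names, and the numeric assurance and coverage weights are assumptions made for this illustration; the source gives no numbers.

```python
from enum import IntEnum

# Assumed total order of the classes in the table above.
class EnforcementClass(IntEnum):
    M0_OBSERVE = 0
    M1_PROXY = 1
    M1_PLUS_SANDBOXED_PROXY = 2
    M2_SANDBOX = 3

def stamp(receipt: dict, claimed: EnforcementClass, boundary: EnforcementClass) -> dict:
    """Stamp a receipt, refusing any class stronger than the emitting boundary."""
    if claimed > boundary:
        raise ValueError(
            f"receipt claims {claimed.name} but was emitted at {boundary.name}"
        )
    return {**receipt, "class": claimed.name}

def proof_value(work: list) -> float:
    """Sum of assurance * coverage over units of work (hypothetical weights)."""
    return sum(assurance * coverage for assurance, coverage in work)

stamped = stamp(
    {"effect": "fs.read"},
    claimed=EnforcementClass.M1_PROXY,
    boundary=EnforcementClass.M2_SANDBOX,
)

# A strong boundary that no work passes through contributes nothing.
unused_strong_boundary = proof_value([(1.0, 0.0)])
# A layer that spans all work but enforces nothing contributes nothing either.
unenforced_everywhere = proof_value([(0.0, 1.0)])
mixed = proof_value([(0.5, 0.5), (1.0, 0.25)])
```

Both extremes evaluate to zero in this sketch, which is the point of the product: only work that actually crosses an enforcing boundary adds proof value.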

Honest gradation is the point: today Liria’s strong boundary (M2) hosts compiled, message-passing agents whose LLM use is a recorded effect. The M1 membrane for foreign agents — interposing on the tool boundary of agents you already run — is design-stage, documented in the repo rather than claimed here. The gradient points upward by design.

Genesis

v0.1.0. Pre-release. Honest about it.

Liria is at Epoch 1 — the substrate: IDL structs, the zero-copy bus, the wasmtime kernel, orthogonal persistence, typed effect families under deny-by-default policy, and the IR/replay/trust/daemon command surface. The planner, the economy, and broader production coverage are on the roadmap, not in this release.

Rough edges are documented in the repo rather than hidden. The IR, replay, trust, and daemon paths are exercised end-to-end by the repo’s gate — and cross-language hash parity is enforced by golden vectors in that gate. We tell you where the edges are.

  • Version: v0.1.0 · Genesis
  • License: MIT OR Apache-2.0 (OPEN)
  • Language: Rust → WebAssembly
  • Runtime: wasmtime + rkyv bus
  • Hashing: BLAKE3 turn records
  • Targets: Dragons (governed runtime)

Compile your first intent.

Build the binaries, validate an example IR, record a turn, and verify that the replay hashes are byte-identical. Five minutes, one MATCH.