Deterministic runtime verification for autonomous agents. When an agent says it did something — sent the email, wrote the file, issued the refund — VERITAS checks whether it actually happened by re-querying the real system, and returns PASS / FAIL / INCONCLUSIVE in plain code. No model grades the result. The thing being judged is never the judge.
Agents that evaluate their own work tend to conclude they succeeded — even when they produced nothing, produced garbage, or reused a stale result. VERITAS is the independent check that sits between an agent's "done" and anyone believing it.
- Deterministic floor. The pass/fail verdict is computed from the contract, the run context, and the raw provider response — with a fixed predicate set, no language model. An optional refutation layer can suggest checks you missed, but it is a separate package that physically cannot change a verdict.
- Ground truth, not a guess. VERITAS re-queries the external state the task should have changed (mailbox, calendar, Drive, payments, a database) and asserts against what it finds.
- It proves the result belongs to this run. The hard part of verification is not "does an email exist" — it is "does this email belong to this run, and not a look-alike (from yesterday, or from a concurrent run)." A top-confidence (EXACT) pass requires an unforgeable, run-namespaced marker the agent had to stamp into something it actually created — so it cannot be earned by reporting an artifact the agent did not cause. A merely agent-recorded id guarded by a server timestamp proves the artifact is in-window, not that this run caused it, so it sits below the auto-pass floor (BOUNDED) unless the operator explicitly opts in for an id-isolated deployment. Everywhere causation cannot be established, VERITAS fails closed.
- Evidence for every check. Never a bare verdict — the raw state, the candidates it considered, and each assertion's expected-vs-actual, in a reproducible, tamper-evident bundle.
- Honest about its limits. A claim is only verified deterministically when a recoverable correlation key exists. UI-only flows, third-party acceptance, and fuzzy outcomes are reported as out of scope for the floor — never quietly passed.
# An agent claims it wrote today's brief to Drive and recorded what it did.
veritas verify --contract VERIFY.md --run ./run-record.json✗ FAIL daily-brief
✗ brief-doc-exists the recorded file id resolves to a document created 2026-05-28,
outside today's window (correlation guard: VERIFIER_OBSERVED createdTime)
– evidence .veritas/runs/2026-05-30T08-02-11Z.json
The agent reported success and recorded a document id — but it pointed at yesterday's brief. VERITAS caught it because the document's own server timestamp falls outside the run window, and an agent-asserted id alone is never enough to pass.
Early development. Architecture and decisions are public and reviewable:
ARCHITECTURE.md— the full design.docs/adr/— the load-bearing decisions (MADR format).benchmark/— the corpus that scores the verifier (the credibility anchor).
- The verdict floor never calls a model. Enforced as a package boundary + a CI dependency-graph check, not a convention.
- Fail closed. Ambiguity, missing keys, and transport errors never become a silent PASS.
- Additive and non-gating. VERITAS drops onto an existing agent or cron job without owning its loop.
- Dependency-light core. The verdict path is standard-library only.
Apache 2.0 — see LICENSE.