Agent Dev.
It builds the agents. Every role in the fleet is a contract plus a verification loop here — not a "you are a helpful…" persona.
Agent Dev builds the roles the rest of the fleet runs on. Its stance is the whole thing: a reliable agent is a contract plus a verification loop, not a persona. Instead of coaxing a prompt, you declare what a role owns, what it must output, and the checks it must clear — and the runtime revises until it clears them or escalates with a reason.
Standing up a new role.
- The lead decides the shape — which role a project is missing, what it should own, and what it must refuse.
- The coder wires it in — adds the role to the registry and creates its inbox and outbox — no plumbing left by hand.
- The prompt writer authors it — writes the role’s instructions as a tight contract, not a personality.
- The evaluator tests it — runs the role against a small packet to confirm it behaves as intended before it goes live.
- Hot-reload, no restart — the registry and prompts are re-read every turn — a role change or model swap takes effect in about two seconds.
A contract, not a persona.
Every role ships with validators, in two kinds. Deterministic checks enforce hard rules — the output matches the schema, the required fields are there, every citation actually supports its claim, no quote runs past its limit. A rubric check has the model grade the work against named criteria. A rubric alone is never enough: the framework refuses to build a role that doesn't pair it with at least one deterministic check.
When a role can't pass within its iteration budget, it doesn't fabricate — it returns a structured escalation naming the last attempt and the checks that failed. A role that can't clear its own gate says so.
A real failure is the input.
When a role fails a live turn, that exact turn becomes the fix. The real task, the real tool-call trace, and the real failure reason go straight to the improver, which rewrites the role's instructions — then runs the offline loop to confirm the new version converges before it's ever put back online. The fleet improves its own roles from what actually broke, not from a hypothetical.
Runs inside AJO. It’s the project that builds and hardens every other role — including the ones behind Tool Engineering and Discovery. Code is private; this page is the record.