Tool building · inside AJO

Tool Engineering.

An autonomous factory that takes a qualified pain and carries it to a shipped, tested tool — a human stepping in only at four gates.

[Figure: The factory line, pain to shipped (status: building). Pain handoff (a qualified pursue verdict comes in) → S0 Intake (G0) → S1 Design (G1) → S2 Design-research → S3 Spec → S4 Validation (G2) → S5 Build → S6 Verify + ship (G3) → Shipped tool (tested, tagged, live). Legend: stages run autonomously; human gates are decided by Daniel.]
A qualified pain from the Pain Point Pipeline enters at the top. Seven stages later, a shipped tool — with a human deciding only at the four gates.

This is the other half of the loop. The Pain Point Pipeline finds a pain worth building; the factory builds it. A plain Python conductor — no model in the driver's seat — carries the pain through seven stages, spawning a fresh session for each and advancing only when a deterministic check passes. A human enters at four gates. Everything between them runs itself.
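The driving loop described above can be sketched in a few lines. This is a minimal illustration, not the private implementation: the `Stage` type, the `conduct` function, the dict-based workspace, and the artifact names are all assumptions made for the example. Each stage's `run` stands in for spawning a fresh session, and `check` is an ordinary function over the resulting artifacts.

```python
from dataclasses import dataclass
from typing import Callable, Optional

# Artifact name -> contents. Stands in for the product repo (assumption).
Workspace = dict[str, str]


@dataclass(frozen=True)
class Stage:
    name: str
    run: Callable[[Workspace], None]    # stands in for spawning a fresh session
    check: Callable[[Workspace], bool]  # deterministic predicate, no model involved


def conduct(stages: list[Stage], workspace: Workspace) -> Optional[str]:
    """Run stages in order. Return the name of the first stage whose check
    fails, or None if every stage passed."""
    for stage in stages:
        stage.run(workspace)
        if not stage.check(workspace):
            return stage.name  # halt: nothing downstream runs
    return None


def writes(name: str, text: str) -> Callable[[Workspace], None]:
    """Hypothetical stage body that writes one artifact."""
    def run(ws: Workspace) -> None:
        ws[name] = text
    return run


def non_empty(name: str) -> Callable[[Workspace], bool]:
    """Hypothetical check: the artifact exists and is not blank."""
    return lambda ws: bool(ws.get(name, "").strip())


stages = [
    Stage("S1 Design", writes("direction.md", "Build a CLI."), non_empty("direction.md")),
    Stage("S3 Spec", writes("spec.md", ""), non_empty("spec.md")),  # blank artifact
    Stage("S5 Build", writes("app.py", "print('hi')"), non_empty("app.py")),
]
workspace: Workspace = {}
halted_at = conduct(stages, workspace)
```

Here the second stage produces a blank artifact, so the loop halts there and the build stage never runs. The point of the shape is that the decision to proceed lives in `check`, which is plain code.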

7 stages from intake to ship (S0–S6)
4 human gates, the only places I step in
61 factory tests green, every gate with a failing case
0 stage transitions decided by a model
01 · How it works

Seven stages, one shipped tool.

  • S0 · Intake — a qualified pain handoff arrives and the conductor spins up the product’s own git repo. (Gate G0: I promote it.)
  • S1 · Design — a fresh session writes the product direction as a checked artifact. (Gate G1: I approve the direction.)
  • S2 · Design-research — the design is researched adversarially; every claim carries a citation, verified by a separate session.
  • S3 · Spec — the spec and a feature list become the buildable contract the later stages are held to.
  • S4 · Validation — a prototype is built and reviewed. (Gate G2: I kill, pivot, or continue.)
  • S5 · Build — a builder session writes the app — and never sees the acceptance tests.
  • S6 · Verify + ship — the holdout suite runs; it had to be red before the build existed. (Gate G3: I approve live use and ship.)
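The stage list above maps naturally onto a small table of data. The sketch below is one way to write it down, assuming a hypothetical `StageSpec` record; the stage keys, names, and gate assignments come from the list, while the type and helper names are invented for illustration.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class StageSpec:
    key: str
    name: str
    gate: Optional[str] = None  # human gate attached to this stage, if any


PIPELINE = (
    StageSpec("S0", "Intake", "G0"),
    StageSpec("S1", "Design", "G1"),
    StageSpec("S2", "Design-research"),
    StageSpec("S3", "Spec"),
    StageSpec("S4", "Validation", "G2"),
    StageSpec("S5", "Build"),
    StageSpec("S6", "Verify + ship", "G3"),
)


def human_gates() -> list[str]:
    """Gates in pipeline order."""
    return [s.gate for s in PIPELINE if s.gate is not None]


def next_stage(key: str) -> Optional[StageSpec]:
    """The stage after `key`, or None once the last stage is done."""
    keys = [s.key for s in PIPELINE]
    i = keys.index(key)  # raises ValueError for an unknown key
    return PIPELINE[i + 1] if i + 1 < len(PIPELINE) else None
```

Keeping the order and the gate assignments in one immutable table means the sequence is fixed in data rather than negotiated at runtime.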
02 · The discipline

No model decides its own grade.

A stage advances when a code predicate confirms its artifacts exist and pass, never because a model reported its own work was good. That one rule is the whole architecture. A gate that can't fail can't gate, so every gate and every permission rule has a test for its failing case.
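To make the rule concrete, here is a hedged sketch of what a code predicate and its failing-case test could look like for the spec stage. The file names `spec.md` and `features.json`, the rule that the feature list must be a non-empty JSON array, and the function names are assumptions for the example, not the factory's real contract.

```python
import json
import tempfile
from pathlib import Path


def spec_stage_passes(artifact_dir: Path) -> bool:
    """True only if both (hypothetical) artifacts exist and are well formed."""
    spec = artifact_dir / "spec.md"
    features = artifact_dir / "features.json"
    if not (spec.is_file() and spec.read_text().strip()):
        return False
    if not features.is_file():
        return False
    try:
        parsed = json.loads(features.read_text())
    except json.JSONDecodeError:
        return False
    return isinstance(parsed, list) and len(parsed) > 0


def test_gate_fails_without_feature_list() -> None:
    # The failing case: a gate that never returns False is not a gate.
    with tempfile.TemporaryDirectory() as tmp:
        root = Path(tmp)
        (root / "spec.md").write_text("# Spec\n")
        assert not spec_stage_passes(root)


def test_gate_passes_with_both_artifacts() -> None:
    with tempfile.TemporaryDirectory() as tmp:
        root = Path(tmp)
        (root / "spec.md").write_text("# Spec\n")
        (root / "features.json").write_text(json.dumps(["export to CSV"]))
        assert spec_stage_passes(root)
```

The two tests are written as a pair on purpose: the passing test alone would also succeed against a predicate that always returns `True`.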

The builder is blind to the test. The acceptance suite lives outside the product's repo, denied to the build session, with a tripwire that audits the transcript afterward. The build can't teach to a test it never sees. And each stage runs in a fresh session with only the permissions it needs — a read-only stage physically can't write.
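A transcript tripwire of this kind could be as simple as scanning recorded tool calls for paths that land inside the holdout directory. The sketch below assumes a transcript made of dicts with a `path` field and a holdout root at `/factory/holdout`; both are hypothetical stand-ins, since the real transcript format is not described here.

```python
import posixpath

HOLDOUT_ROOT = "/factory/holdout"  # hypothetical location outside the product repo


def touches_holdout(path: str, root: str = HOLDOUT_ROOT) -> bool:
    """True if `path` is the holdout root or anything beneath it.

    normpath collapses '..' segments, so a path that climbs out of the
    product repo and back down into the holdout directory is still caught.
    """
    norm = posixpath.normpath(path)
    root = posixpath.normpath(root)
    return norm == root or norm.startswith(root + "/")


def audit_transcript(events: list[dict]) -> list[dict]:
    """Return every recorded event whose path lands in the holdout tree."""
    return [e for e in events if touches_holdout(e.get("path", ""))]


transcript = [
    {"tool": "read", "path": "/factory/products/app/main.py"},
    {"tool": "read", "path": "/factory/products/app/../../holdout/test_accept.py"},
    {"tool": "read", "path": "/factory/holdout-notes/readme.md"},
]
violations = audit_transcript(transcript)
```

Only the second event is flagged. The third shares a string prefix with the holdout root but is a different directory, which is why the check compares against `root + "/"` rather than using a bare prefix match. This sketch does not follow symlinks; an audit that needs that would have to resolve paths on the real filesystem.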

03 · The human gates

Four decisions, mine to make.

The factory runs itself between gates. At four points it stops and waits for a call it shouldn't make alone.

  • G0 · Intake: promotion, deciding this pain is worth the factory’s time at all.
  • G1 · Design: direction, deciding whether this is the right thing to build before any code is written.
  • G2 · Validation: kill, pivot, or continue once there’s a prototype to react to.
  • G3 · Verify: live use and ship, the one gate that touches the real world.
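One way to keep the human decisions explicit is to give each gate a closed vocabulary and have the conductor wait until a valid decision is recorded. The sketch below is an assumption-heavy illustration: the `kill`, `pivot`, and `continue` options at G2 come from the text, but the `reject` option, the `Outcome` names, and the idea that a pivot sends work back for rework are invented for the example.

```python
from enum import Enum
from typing import Optional


class Outcome(Enum):
    WAIT = "wait"        # no decision recorded yet
    ADVANCE = "advance"  # proceed to the next stage
    REWORK = "rework"    # assumed meaning of a pivot
    STOP = "stop"        # the product ends here


# Allowed decisions per gate. Entries marked hypothetical are not from the source.
GATE_DECISIONS: dict[str, dict[str, Outcome]] = {
    "G0": {"promote": Outcome.ADVANCE, "reject": Outcome.STOP},  # reject: hypothetical
    "G1": {"approve": Outcome.ADVANCE, "reject": Outcome.STOP},  # reject: hypothetical
    "G2": {
        "continue": Outcome.ADVANCE,
        "pivot": Outcome.REWORK,
        "kill": Outcome.STOP,
    },
    "G3": {"ship": Outcome.ADVANCE, "reject": Outcome.STOP},     # reject: hypothetical
}


def resolve(gate: str, decision: Optional[str]) -> Outcome:
    """Map a recorded human decision to what the conductor does next."""
    if gate not in GATE_DECISIONS:
        raise ValueError(f"unknown gate {gate!r}")
    if decision is None:
        return Outcome.WAIT  # the factory stops and waits; it never guesses
    options = GATE_DECISIONS[gate]
    if decision not in options:
        raise ValueError(f"{decision!r} is not a valid decision at {gate}")
    return options[decision]
```

An unrecognised decision raises instead of defaulting to `ADVANCE`, so a typo in a recorded decision cannot quietly push a product through a gate.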
04 · What was hard

Keeping the conductor dumb.

The conductor is plain Python on purpose — the moment a model decides whether to proceed, the guarantee is gone. The hard part was everything around that: spawning headless sessions that actually write files, permission profiles enforced as deny-complements so a read-only stage stays read-only, and a heartbeat that doesn't kill a product just for waiting at a gate. Every bug got pinned with a failing test — which is why the suite is worth trusting.
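Two of those pieces are small enough to sketch. The first computes a deny list as the complement of what a stage is allowed, so a tool added later is denied by default. The second is a liveness check that treats waiting at a gate as healthy. Tool names, state names, and the timeout are hypothetical values chosen for the example.

```python
# Hypothetical universe of tools a session could be granted.
ALL_TOOLS = frozenset({"read", "write", "edit", "shell"})


def deny_list(allowed: set[str]) -> frozenset[str]:
    """Everything not explicitly allowed is denied.

    Deriving the deny set from the full universe means a tool added to
    ALL_TOOLS later is denied to every existing profile by default.
    """
    unknown = set(allowed) - ALL_TOOLS
    if unknown:
        raise ValueError(f"unknown tools in profile: {sorted(unknown)}")
    return ALL_TOOLS - frozenset(allowed)


def is_stalled(state: str, seconds_since_heartbeat: float,
               timeout: float = 300.0) -> bool:
    """A product waiting at a human gate is idle by design, never stalled."""
    if state == "waiting_at_gate":
        return False
    return seconds_since_heartbeat > timeout


read_only_denied = deny_list({"read"})
```

For a read-only profile, the deny list comes out as the three mutating tools. The heartbeat check special-cases the gate state first, so a product that waits a week for a decision is not reaped for being quiet.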

Where it runs. Inside AJO, fed by the Pain Point Pipeline. Currently attended, one stage at a time. Code is private; this page is the record.
