Pain Point Pipeline.
One question in. A pursue, hold, or pass verdict out — grounded in real review-site evidence of what people actually pay for.
Most idea validation runs on complaints. This runs on spend. Point it at a domain and it turns one question into an evidence-backed verdict on a buildable pain — every claim tied to a page it actually captured, weighted by proof that someone pays. It's Discovery's live engine, isolated from the old Sextant retrieval stack it replaced.
Six stages, one cited artifact.
- Intake — the query is tagged as discovery with a recency window. No heavy router — the weight is all downstream.
- Sourcing oracle — a model decides where to look: review sites for demonstrated spend, forums for raw complaints, press for market read. This one call decides whether a run finds spend evidence at all.
- Steering — probe queries, domains, and budget are derived, then filtered by a live-capability gate that skips any source whose keys aren’t configured.
- Discovery probes — URLs are found per domain: native APIs for Hacker News, Reddit, and GitHub; site search for review sites; a Perplexity fallback when neither fits.
- Capture — a 49-adapter layer fetches each page’s text and engagement signals, escalating from a real browser to a residential unblocker for the hardest-walled sites.
- Synthesis — the model reads only what was captured and returns pains with citations; every evidence URL is a page the run actually fetched.
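The live-capability gate in the Steering stage can be sketched roughly as below. This is an illustration, not the pipeline's code: the `Source` type, the `live_sources` function, and the environment-variable names are all hypothetical, and it assumes "configured" means every required key is present and non-empty.

```python
import os
from dataclasses import dataclass


@dataclass(frozen=True)
class Source:
    domain: str
    # Env vars this source needs (hypothetical names); empty means no key required.
    required_env: tuple[str, ...] = ()


def live_sources(candidates, env=None):
    """Keep only the sources whose required keys are all present and non-empty."""
    env = os.environ if env is None else env
    return [s for s in candidates if all(env.get(k) for k in s.required_env)]


candidates = [
    Source("news.ycombinator.com"),  # assumed here to need no key
    Source("reddit.com", ("REDDIT_CLIENT_ID", "REDDIT_CLIENT_SECRET")),
    Source("g2.com", ("UNBLOCKER_API_KEY",)),
]

# Only the unblocker key is configured in this example environment.
example_env = {"UNBLOCKER_API_KEY": "example-key"}
kept = [s.domain for s in live_sources(candidates, example_env)]
print(kept)
```

Running the gate before any probe is issued means a missing key costs nothing downstream: the source simply never enters the run's budget.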
Grounded on spend, not complaints.
A forum tells you people are annoyed. A review site tells you people are paying. The sourcing oracle routes each query toward review sites — G2, Capterra, Trustpilot — where the writing is about tools someone already bought. That's the difference between a complaint and a market.
The routing is deliberate and fragile in a telling way: the query has to name the kind of source it wants. Asking for "reviews" once sent every run to Reddit and scored every pain a pass; asking for "user reviews" flips the routing to the sites that prove spend. Every pain the pipeline returns carries a citation to a page it actually fetched. The evidence is a subset of what it captured, or it doesn't ship.
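The subset rule can be expressed as a small check. The sketch below is an assumption about shape, not the real implementation: the `grounded_pains` function and the `evidence_urls` field are hypothetical names, and dropping pains that cite nothing at all is an added assumption.

```python
def grounded_pains(pains, captured_urls):
    """Return only the pains whose evidence URLs were all captured in this run."""
    captured = set(captured_urls)
    kept = []
    for pain in pains:
        urls = pain.get("evidence_urls", [])
        # Assumption: a pain with no evidence is dropped, like one citing an unfetched page.
        if urls and set(urls) <= captured:
            kept.append(pain)
    return kept


captured = ["https://example.com/reviews/a", "https://example.com/reviews/b"]
pains = [
    {"title": "Export is manual", "evidence_urls": ["https://example.com/reviews/a"]},
    {"title": "Pricing is opaque", "evidence_urls": ["https://example.com/not-fetched"]},
    {"title": "No citations", "evidence_urls": []},
]

shipped = [p["title"] for p in grounded_pains(pains, captured)]
print(shipped)
```

Checking after synthesis, rather than trusting the model to cite correctly, keeps the guarantee independent of the prompt.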
Four scores, one call.
The Researcher layer wraps the pipeline — it writes the qualifying questions, runs each one through, and scores every candidate pain on four dimensions before ranking it pursue, hold, or pass.
A pursue doesn't sit in a doc. The verdict becomes a task, handed to Tool Engineering to build. That handoff, from discovered pain to shipped tool, is where the verdict pays off.
The evidence is behind walls.
The sites with the best spend evidence sit behind the hardest anti-bot walls. A local headless browser gets a challenge page, not reviews. So capture runs an escalation ladder — a real Chrome session first, then a residential unblocker for the walled domains. Getting G2 from a mislabeled "blocked" to 24,000 characters and 80 real reviews was a fetch-layer fight, not a prompt.
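A minimal sketch of such an escalation ladder follows, assuming each tier is a plain function from URL to page text. The names `Capture`, `looks_blocked`, and `capture_with_escalation`, the 500-character threshold, and the challenge-page phrase are hypothetical; the fetchers are stubs standing in for a real browser session and an unblocker. The result records which tier actually served the page, so the label cannot drift from what happened.

```python
from dataclasses import dataclass


@dataclass
class Capture:
    url: str
    text: str
    served_by: str  # the tier that actually returned usable text


def looks_blocked(text, min_chars=500):
    """Hypothetical heuristic: very short pages or challenge text count as blocked."""
    return len(text) < min_chars or "verify you are human" in text.lower()


def capture_with_escalation(url, tiers):
    """Try each (name, fetch) tier in order; return the first unblocked capture, else None."""
    for name, fetch in tiers:
        try:
            text = fetch(url)
        except Exception:
            # In this sketch any fetch error just escalates to the next tier.
            continue
        if not looks_blocked(text):
            return Capture(url=url, text=text, served_by=name)
    return None


# Stub fetchers for illustration only.
def fake_browser(url):
    return "Verify you are human"


def fake_unblocker(url):
    return "Great tool, worth the price. " * 50


result = capture_with_escalation(
    "https://example.com/reviews",
    [("browser", fake_browser), ("unblocker", fake_unblocker)],
)
print(result.served_by, len(result.text))
```

Returning `None` when every tier fails, instead of a capture with an empty body, keeps a blocked page from being mistaken for a fetched one.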
The recurring bugs shared one shape: an over-broad match, or a label that lied about what happened, such as a capture tagged "browser" when the unblocker really served it. Most of a week's fixes were making the system honest about its own behavior. Code is private; this page is the record.