Failure Radius
Type: Risk management concept
Referenced from: audit-test-automate-ai-delegation
Definition
The failure radius is the worst-case downside when AI output cannot be directly verified. Instead of trying to verify the unverifiable, you bound the damage.
"When verification is structurally impossible, the answer isn't 'verify harder' — it's 'bound the downside.'"
Two Cases of Evaluation
Case 1: Verifiable Output
If you can judge the work (code if you're a programmer, design if you're a designer):
- Rule: Never spend more time reviewing AI work than you'd spend reviewing human work
- You're the main cost, not the AI: if reviewing the AI's output takes longer than reviewing equivalent human work, delegating has increased your cost instead of decreasing it
Case 2: Unverifiable Output
When you can't evaluate the output (like judging your CPA's work or your doctor's prescription):
- Traditional solution: operate from trust (known people, known brands)
- AI isn't "people" — so trust doesn't apply
- New approach: Define the failure radius — what's the worst that can happen?
Examples
| Task | Verification | Failure Radius | Action |
|---|---|---|---|
| Vibe coding (non-coder) | Can't judge code | Limited to the local environment (run locally, test behavior, keep away from production) | Bound by isolation |
| Comcast credit negotiation | Can evaluate (money back) | Near zero (worst case: Comcast says no) | Safe to automate |
| Health provider report | Can evaluate (documents) | Too large (expulsion risk) | Add human review step |
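The table can be read as a small decision rule: choose the action from the worst case, not from how much you trust the output. The sketch below is one possible encoding, not a definitive one. The `Radius` levels, the `Task` type, and the `choose_action` function are hypothetical names introduced for illustration, and the "review directly" branch is an assumption for a case the table does not cover.

```python
from dataclasses import dataclass
from enum import Enum


class Radius(Enum):
    """Hypothetical failure radius levels, loosely following the table."""
    NEAR_ZERO = "near zero"      # worst case is a harmless "no"
    CONTAINABLE = "containable"  # damage can be boxed in, e.g. run locally
    TOO_LARGE = "too large"      # worst case is unacceptable


@dataclass(frozen=True)
class Task:
    name: str
    can_verify: bool
    radius: Radius


def choose_action(task: Task) -> str:
    """Pick a handling strategy from the worst case, not from trust."""
    if task.radius is Radius.TOO_LARGE:
        return "add human review step"
    if task.radius is Radius.NEAR_ZERO:
        return "safe to automate"
    # Containable radius: isolate when the output cannot be judged.
    # "review directly" is an assumed fallback; the table has no such row.
    return "review directly" if task.can_verify else "bound by isolation"


tasks = [
    Task("vibe coding (non-coder)", can_verify=False, radius=Radius.CONTAINABLE),
    Task("Comcast credit negotiation", can_verify=True, radius=Radius.NEAR_ZERO),
    Task("health provider report", can_verify=True, radius=Radius.TOO_LARGE),
]

for task in tasks:
    print(f"{task.name}: {choose_action(task)}")
```

Note that the radius is checked before verifiability: the health provider report can be evaluated, yet it still gets a human review step because the worst case is too large.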
Application to Agent Design
- Quality gates = failure radius controls (catch bad output before it propagates)
- Human review steps = explicit failure radius bounds for high-stakes tasks
- Permission boundaries = scope limits that contain failure radius
- The pipeline's `NOT-READY` gate decision is a failure radius mechanism
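As a rough illustration of how these controls might combine, the sketch below shows a gate that returns `NOT-READY` whenever any control fails. It is a minimal sketch under stated assumptions, not the pipeline's actual implementation: `StepResult`, `ALLOWED_SCOPES`, the scope names, the `gate` function, and the `READY` value are all hypothetical. Only the `NOT-READY` decision comes from the text above.

```python
from dataclasses import dataclass, field


@dataclass
class StepResult:
    """Hypothetical record of one agent step awaiting a gate decision."""
    output: str
    checks_passed: bool              # outcome of automated quality checks
    high_stakes: bool = False        # True when the failure radius is large
    human_approved: bool = False     # set by a human review step
    requested_scopes: set[str] = field(default_factory=set)


# Assumed permission boundary: the only scopes this agent may use.
ALLOWED_SCOPES = {"read_repo", "write_sandbox"}


def gate(result: StepResult) -> str:
    """Return "READY" only when every failure radius control passes."""
    if not result.checks_passed:
        return "NOT-READY"  # quality gate: stop bad output from propagating
    if not result.requested_scopes <= ALLOWED_SCOPES:
        return "NOT-READY"  # permission boundary: scope exceeds the limit
    if result.high_stakes and not result.human_approved:
        return "NOT-READY"  # human review: explicit bound for high stakes
    return "READY"


print(gate(StepResult("draft", checks_passed=True)))
print(gate(StepResult("draft", checks_passed=True, high_stakes=True)))
```

The gate fails closed: any single failed control yields `NOT-READY`, so output only moves forward when all three bounds hold.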
Related
- three-questions-test — The third question
- quality-gates — Pipeline failure radius controls
- autonomy-policy-v3 — When agents can operate without bounds
- audit-test-automate-ai-delegation — Source article