Failure Radius

Type: Risk management concept
Referenced from: audit-test-automate-ai-delegation

Definition

The failure radius is the worst-case downside when AI output cannot be directly verified. Instead of trying to verify the unverifiable, you bound the damage.

"When verification is structurally impossible, the answer isn't 'verify harder' — it's 'bound the downside.'"

Two Cases of Evaluation

Case 1: Verifiable Output

If you can judge the work (code if you're a programmer, design if you're a designer):
- Rule: Never spend more time reviewing AI work than you would spend reviewing the same work from a human
- Your time is the main cost, not the AI's. If review takes longer than it would for human work, delegating has raised the total cost instead of lowering it

Case 2: Unverifiable Output

When you can't evaluate the output (like judging your CPA's work or your doctor's prescription):
- Traditional solution: operate from trust (known people, known brands)
- An AI is not a known person or brand, so that kind of trust does not carry over
- New approach: Define the failure radius — what's the worst that can happen?

Examples

| Task | Verification | Failure Radius | Action |
| --- | --- | --- | --- |
| Vibe coding (non-coder) | Can't judge code | Run locally, test behavior, keep away from production | Bound by isolation |
| Comcast credit negotiation | Can evaluate (money back) | Near zero (worst case: Comcast says no) | Safe to automate |
| Health provider report | Can evaluate (documents) | Too large (expulsion risk) | Add human review step |
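
The mapping from failure radius to action in the examples above can be sketched in a few lines of Python. This is only an illustration: the three radius levels, the `Task` type, and the `choose_action` function are hypothetical names introduced here, not part of the source article, and sizing the radius for a real task remains a judgment call.

```python
from dataclasses import dataclass
from enum import Enum


class Radius(Enum):
    """Hypothetical radius levels, inferred from the examples table."""
    NEAR_ZERO = "near_zero"      # worst case is harmless (the request is declined)
    CONTAINABLE = "containable"  # harmful only if the output escapes a sandbox
    LARGE = "large"              # worst case is severe or hard to undo


@dataclass(frozen=True)
class Task:
    name: str
    radius: Radius


def choose_action(task: Task) -> str:
    """Pick a control based on the worst case, not on how good the output looks."""
    if task.radius is Radius.NEAR_ZERO:
        return "automate"
    if task.radius is Radius.CONTAINABLE:
        return "isolate"
    return "human_review"


tasks = [
    Task("Vibe coding (non-coder)", Radius.CONTAINABLE),
    Task("Comcast credit negotiation", Radius.NEAR_ZERO),
    Task("Health provider report", Radius.LARGE),
]

for task in tasks:
    print(f"{task.name}: {choose_action(task)}")
```

The function never inspects the output itself, which is the point of the concept: when the output cannot be judged, the decision rests on the size of the worst case.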

Application to Agent Design

- Quality gates = failure radius controls (catch bad output before it propagates)
- Human review steps = explicit failure radius bounds for high-stakes tasks
- Permission boundaries = scope limits that contain failure radius
- The pipeline's NOT-READY gate decision IS a failure radius mechanism
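
As a rough sketch of how these three controls might combine into a single gate decision, consider the Python below. Everything in it is assumed for illustration: `evaluate_gate`, `GateResult`, the check format, and the action names are hypothetical and do not describe the actual pipeline's interface. Only the READY / NOT-READY vocabulary comes from the text above.

```python
from dataclasses import dataclass, field


@dataclass
class GateResult:
    decision: str                                # "READY" or "NOT-READY"
    reasons: list = field(default_factory=list)  # why the gate blocked, if it did


def evaluate_gate(output, checks, requested_actions, allowed_actions,
                  high_stakes=False, human_approved=False):
    """Return NOT-READY if any failure radius control objects."""
    reasons = []

    # Quality gate: catch bad output before it propagates.
    for name, check in checks:
        if not check(output):
            reasons.append(f"failed check: {name}")

    # Permission boundary: the agent may only act inside its allowed scope.
    out_of_scope = set(requested_actions) - set(allowed_actions)
    if out_of_scope:
        reasons.append(f"out-of-scope actions: {sorted(out_of_scope)}")

    # Human review: an explicit bound for high-stakes tasks.
    if high_stakes and not human_approved:
        reasons.append("human review required")

    return GateResult("NOT-READY" if reasons else "READY", reasons)


# Hypothetical usage: a high-stakes draft that also asks for an action
# outside its permissions is blocked on two counts.
checks = [("non-empty", lambda text: bool(text.strip()))]
blocked = evaluate_gate("Draft report", checks,
                        requested_actions={"send_email"},
                        allowed_actions={"write_draft"},
                        high_stakes=True)
print(blocked.decision, blocked.reasons)
```

The gate defaults to blocking: a single objection from any control yields NOT-READY, so the failure radius is bounded by the strictest control rather than the most lenient one.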

- three-questions-test: The third question
- quality-gates: Pipeline failure radius controls
- autonomy-policy-v3: When agents can operate without bounds
- audit-test-automate-ai-delegation: Source article