3-Questions Test
Type: Decision framework
Referenced from: audit-test-automate-ai-delegation
Definition
A task is suitable for full AI delegation when all three conditions hold:
- Publicly documented: the domain is learnable from public sources (books, standards, accredited courses). Frontier labs train on most publicly available text, so if the material is even partly public, the LLM has likely seen it.
- Average result is fine: the task doesn't need outlier quality, so average AI output is acceptable. Example: routine dev status reports pass; LinkedIn posts fail because they need above-average quality to cut through the noise.
- Can eval or bound failure radius: either you can judge the output directly (code review if you're a programmer, design review if you're a designer), or you can define the worst-case downside and keep it contained.
Why All Three Must Hold
One structural mismatch kills full automation. A task can pass two questions but fail the third:
- LinkedIn writing passes (1) publicly documented and (3) evaluable, but fails (2): it needs above-average quality.
- A health provider report passes (1) and (2), but fails (3): the failure radius is too large (expulsion risk).
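The rule can be sketched as a small predicate. This is a minimal illustration, not part of the source article: the `Task` dataclass, its field names, and the helper functions are hypothetical names chosen here, and the boolean answers for the two examples are transcribed from the bullets above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Task:
    """Answers to the three questions for one task (hypothetical structure)."""
    name: str
    publicly_documented: bool  # (1) learnable from public sources
    average_is_fine: bool      # (2) no outlier quality needed
    can_eval_or_bound: bool    # (3) output is judgeable or downside is contained


def failed_questions(task: Task) -> list[int]:
    """Return the numbers of the questions the task fails (empty list = passes)."""
    answers = [task.publicly_documented, task.average_is_fine, task.can_eval_or_bound]
    return [i for i, ok in enumerate(answers, start=1) if not ok]


def suitable_for_full_delegation(task: Task) -> bool:
    """All three conditions must hold; a single failure blocks full delegation."""
    return not failed_questions(task)


linkedin = Task("LinkedIn writing", True, False, True)
health_report = Task("Health provider report", True, True, False)

for task in (linkedin, health_report):
    print(task.name, "fails question(s)", failed_questions(task))
```

Returning the failed question numbers, rather than a bare boolean, keeps the reason for rejection visible, which matters when deciding how to redesign the task.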
Application to dark-factory-kb
This test can be applied to every pipeline task:
- Source ingestion → passes (public URL content, average quality OK, evaluable)
- Concept extraction → passes (patterns are documented, average OK, link check = eval)
- HTML generation → passes (templated, average OK, smoke tests = eval)
- Deploy → passes (standard CF workflow, average OK, health check = eval)
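As a rough sketch, the same audit can be recorded as data so that each verdict stays traceable to its evidence. The dictionary layout and the `verdict` helper are illustrative assumptions, not part of dark-factory-kb; the evidence strings are copied from the list above.

```python
# Evidence per pipeline task, in question order:
# (1) publicly documented, (2) average is fine, (3) how the output is evaluated.
# An empty string would mean "no evidence", i.e. that question fails.
PIPELINE_AUDIT = {
    "Source ingestion": ("public URL content", "average quality OK", "evaluable"),
    "Concept extraction": ("patterns are documented", "average OK", "link check"),
    "HTML generation": ("templated", "average OK", "smoke tests"),
    "Deploy": ("standard CF workflow", "average OK", "health check"),
}


def verdict(evidence: tuple[str, str, str]) -> str:
    """A task passes only if every question has non-empty evidence."""
    return "passes" if all(evidence) else "fails"


for task, evidence in PIPELINE_AUDIT.items():
    print(f"{task}: {verdict(evidence)}")
```

A new pipeline step would get its own entry, and leaving any of the three slots empty flips its verdict to "fails".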
Related
- failure-radius — The third question's failure case
- delegation-redesign — What to do when tasks fail the test
- factory-rules — Factory delegation patterns
- audit-test-automate-ai-delegation — Source article