3-Questions Test¶

Type: Decision framework
Referenced from: audit-test-automate-ai-delegation

Definition¶

A task is suitable for full AI delegation when all three conditions hold:

1. Publicly documented: The domain is learnable from public sources (books, standards, accredited courses). Frontier labs train on most publicly available material, so if the domain is even partly public, the LLM likely knows it.
2. Average result is fine: The task doesn't need outlier quality, so average AI output is acceptable. Example: routine dev status reports pass; LinkedIn posts fail, because they need above-average quality to cut through the noise.
3. Can eval or bound failure radius: Either you can judge the output directly (code review if you're a programmer, design review if you're a designer), or you can define the worst-case downside and keep it contained.
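
As a rough illustration, the three conditions can be written as a single all-or-nothing check. This is a minimal sketch, not part of the source article: the `TaskAssessment` class and its field names are hypothetical, and the answers given for the status report on questions (1) and (3) are assumed for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TaskAssessment:
    """Answers to the three questions for one task (hypothetical names)."""
    name: str
    publicly_documented: bool    # question 1
    average_is_fine: bool        # question 2
    evaluable_or_bounded: bool   # question 3

    def failed_questions(self) -> list[int]:
        """Return the numbers (1-3) of the questions this task fails."""
        answers = [
            self.publicly_documented,
            self.average_is_fine,
            self.evaluable_or_bounded,
        ]
        return [i for i, ok in enumerate(answers, start=1) if not ok]

    def fully_delegable(self) -> bool:
        """Full delegation requires all three conditions to hold."""
        return not self.failed_questions()


# Assumed answers: the text only states that status reports pass question 2.
status_report = TaskAssessment("routine dev status report", True, True, True)
# LinkedIn posts need above-average quality, so question 2 fails.
linkedin_post = TaskAssessment("LinkedIn post", True, False, True)

print(status_report.fully_delegable())   # True
print(linkedin_post.failed_questions())  # [2]
```

Returning the failed question numbers, rather than a bare yes/no, keeps the reason for rejecting a task visible.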

Why All Three Must Hold¶

One structural mismatch kills full automation. A task can pass two questions and still fail the remaining one:
- LinkedIn writing passes (1) publicly documented and (3) evaluable, but fails (2): it needs above-average quality.
- A health provider report passes (1) and (2), but fails (3): the failure radius is too large (expulsion risk).

Application to dark-factory-kb¶

This test can be applied to every pipeline task:
- Source ingestion → passes (public URL content, average quality OK, evaluable)
- Concept extraction → passes (patterns are documented, average OK, link check = eval)
- HTML generation → passes (templated, average OK, smoke tests = eval)
- Deploy → passes (standard CF workflow, average OK, health check = eval)
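
The list above can also be kept as data, so that a stage without a named eval is flagged before it runs unattended. This is only a sketch under assumptions: the `Stage` type, the `stages_needing_redesign` helper, and the "Unreviewed stage" entry are hypothetical, and the list does not name a specific check for source ingestion, so it is recorded simply as "evaluable".

```python
from typing import NamedTuple, Optional


class Stage(NamedTuple):
    """One pipeline task and its answers to the three questions (hypothetical type)."""
    name: str
    publicly_documented: bool
    average_ok: bool
    eval_check: Optional[str]  # None means no way to judge or bound the output


# Values transcribed from the list above.
PIPELINE = [
    Stage("Source ingestion", True, True, "evaluable"),
    Stage("Concept extraction", True, True, "link check"),
    Stage("HTML generation", True, True, "smoke tests"),
    Stage("Deploy", True, True, "health check"),
]


def stages_needing_redesign(stages: list[Stage]) -> list[str]:
    """Return the names of stages that fail at least one of the three questions."""
    return [
        s.name
        for s in stages
        if not (s.publicly_documented and s.average_ok and s.eval_check)
    ]


print(stages_needing_redesign(PIPELINE))  # []

# A made-up stage with no eval mechanism fails question 3 and is flagged.
extra = PIPELINE + [Stage("Unreviewed stage", True, True, None)]
print(stages_needing_redesign(extra))  # ['Unreviewed stage']
```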

Related¶

- failure-radius: the third question's failure case
- delegation-redesign: what to do when tasks fail the test
- factory-rules: factory delegation patterns
- audit-test-automate-ai-delegation: source article
