Factory Pattern vs Anthropic Dynamic Workflows
What Anthropic's Dynamic Workflows Actually Are
On May 28, 2026, Anthropic shipped Dynamic Workflows as a research preview in Claude Code, bundled with Opus 4.8. The core mechanism:
- Claude receives a natural-language request
- Instead of working turn-by-turn, it generates a JavaScript orchestration script
- A runtime executes that script in the background
- The script spawns coordinated subagents (up to 1,000 total, 16 concurrent)
- Intermediate results live in script variables, not in Claude's context window
- Final output is a consolidated report
Key technical properties:
- Scale: Up to 1,000 agents per execution, 16 concurrent
- Activation: Use the word "workflow" in a prompt, or /effort ultracode
- Repeatability: Scripts saved to .claude/workflows/ and re-executed via slash command
- Resumability: Interrupted workflows resume from cached state — agents pick up where they left off
- Permissions: Subagents always operate in acceptEdits mode; the script itself cannot access filesystem or shell directly
- No mid-execution user input: Only permission prompts can pause a workflow
- Model routing: Each agent uses the session's model unless the script routes to a different one
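The fan-out shape described above can be sketched in a few lines. This is a minimal illustration in Python, not Anthropic's runtime or script API: `run_subagent`, `fan_out`, and `consolidate` are hypothetical names, and only the two limits (1,000 agents, 16 concurrent) come from the text. It shows bounded concurrency, results held in ordinary variables, and a consolidated report at the end.

```python
import asyncio

MAX_AGENTS = 1000      # per-execution cap quoted above
MAX_CONCURRENT = 16    # concurrency cap quoted above


async def run_subagent(task: str) -> str:
    """Hypothetical stand-in for one subagent; a real one would call a model."""
    await asyncio.sleep(0)  # yield control, simulating asynchronous work
    return f"done:{task}"


async def fan_out(tasks: list[str]) -> tuple[dict[str, str], int]:
    """Run every task with bounded concurrency; return results and observed peak."""
    if len(tasks) > MAX_AGENTS:
        raise ValueError(f"{len(tasks)} tasks exceeds the {MAX_AGENTS}-agent cap")

    gate = asyncio.Semaphore(MAX_CONCURRENT)
    running = 0
    peak = 0

    async def guarded(task: str) -> tuple[str, str]:
        nonlocal running, peak
        async with gate:  # at most MAX_CONCURRENT bodies run at once
            running += 1
            peak = max(peak, running)
            try:
                return task, await run_subagent(task)
            finally:
                running -= 1

    pairs = await asyncio.gather(*(guarded(t) for t in tasks))
    # Intermediate results stay in an ordinary variable, not in a prompt.
    return dict(pairs), peak


def consolidate(results: dict[str, str]) -> str:
    """Fan-in: reduce per-agent results to one short report."""
    ok = sum(1 for value in results.values() if value.startswith("done:"))
    return f"{ok}/{len(results)} tasks completed"
```

Calling `asyncio.run(fan_out([...]))` returns the result map together with the peak number of simultaneously running tasks, which never exceeds the semaphore size.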
Flagship demo: Jarred Sumner rewrote Bun from Zig to Rust — ~750,000 lines in 11 days, 99.8% tests green.
Our Factory Pattern
The Kelly Factory is an orchestration layer built on OpenClaw, designed around several distinct mechanisms:
Subagent Spawning: spawn-subphase.sh reads workflow YAML frontmatter to determine agent, phase, and project path. Each subagent runs in isolation with its own context window, tool policies, and session. The script is the single source of truth for who does what — never hardcoded.
Beads (Issue Tracking): A Dolt-powered, git-versioned SQL queryable substrate. Every subphase creates a Bead with reason, who, when, why. Beads track dependencies, blocking relationships, and pipeline progress. bd ready surfaces only unblocked steps. bd pour instantiates full molecules (pipelines as dependency graphs).
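The "surface only unblocked steps" behavior can be illustrated with a small in-memory model. This is a sketch of the idea behind bd ready, not the Beads implementation: the `Bead` class, its field names other than `reason` and `who`, and the `ready` function are assumptions made for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Bead:
    """Hypothetical in-memory stand-in for one Bead record."""
    bead_id: str
    reason: str                      # the "why", stored as data
    who: str
    closed: bool = False
    blocked_by: set[str] = field(default_factory=set)


def ready(beads: dict[str, Bead]) -> list[str]:
    """Return open beads whose blockers are all closed."""
    return sorted(
        bead.bead_id
        for bead in beads.values()
        if not bead.closed
        and all(beads[dep].closed for dep in bead.blocked_by)
    )


# Example pipeline: plan -> build -> audit.
PIPELINE = {
    "plan": Bead("plan", "scope the change", "planner"),
    "build": Bead("build", "implement the plan", "builder", blocked_by={"plan"}),
    "audit": Bead("audit", "verify before ship", "auditor", blocked_by={"build"}),
}
```

With nothing closed, only `plan` is ready; closing `plan` makes `build` the next ready step without any change to scheduling code.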
Chain Protocol (Auto-Spawn): When a subagent completes, sessions_yield pushes a completion event back to the Router. The Router closes the bead, reads the formula TOML for the next step, runs spawn-subphase.sh --auto-spawn, and spawns the next subagent. This is a push-based, event-driven chain — no polling, no timers, no single point of failure.
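The push-based chain can be sketched as a callback: a completion event arrives, the Router closes the bead, looks up the next step, and spawns it. The sketch below is an assumption-laden illustration, not the Router's code: `FORMULA` is a hypothetical dictionary standing in for the formula TOML, and `spawn` stands in for spawn-subphase.sh --auto-spawn.

```python
from typing import Callable, Optional

# Hypothetical formula: each step names the step that follows it.
FORMULA: dict[str, Optional[str]] = {"plan": "build", "build": "audit", "audit": None}


class Router:
    """Closes the finished step's bead and spawns the next step; does no work itself."""

    def __init__(self, spawn: Callable[[str], None]) -> None:
        self.spawn = spawn
        self.closed: list[str] = []

    def on_completion(self, step: str) -> Optional[str]:
        """Handle a pushed completion event for `step`."""
        if step not in FORMULA:
            raise KeyError(f"unknown step: {step}")
        self.closed.append(step)        # stand-in for closing the bead
        next_step = FORMULA[step]       # stand-in for reading the formula TOML
        if next_step is not None:
            self.spawn(next_step)       # stand-in for spawn-subphase.sh --auto-spawn
        return next_step
```

Nothing in this loop polls or sleeps: each step advances only when the previous step's completion event is delivered to `on_completion`.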
DONE Markers: Dual-write discipline — both the DONE file (filesystem) and bead closure (Dolt) must fire. Subagents write DONE; Router verifies and closes the bead. This creates an audit trail and prevents phantom completions.
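The dual-write check can be sketched as follows. This is a minimal illustration under stated assumptions: the marker is a file literally named DONE inside a per-step directory, and `closed_beads` is a hypothetical in-memory set standing in for the Dolt-backed bead store.

```python
from pathlib import Path


def close_if_done(step_dir: Path, bead_id: str, closed_beads: set[str]) -> bool:
    """Close the bead only when the subagent's DONE file exists.

    `closed_beads` is a hypothetical stand-in for the Dolt-backed bead store.
    """
    if not (step_dir / "DONE").is_file():
        return False             # no marker yet: refuse a phantom completion
    closed_beads.add(bead_id)    # second write: record closure in the store
    return True


def is_complete(step_dir: Path, bead_id: str, closed_beads: set[str]) -> bool:
    """A step counts as complete only when both writes are present."""
    return (step_dir / "DONE").is_file() and bead_id in closed_beads
```

Keeping the write (subagent) and the verification (Router) in separate roles is what lets either record expose a missing counterpart.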
Quality Gates: TEA 3-phase audit, 5-agent adversarial verdict, human SHIP/NO-SHIP gate. RALPH protocol: 3 retries per task, then escalate with diagnostics. Never spin on the same error twice.
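The retry-then-escalate rule can be sketched as a small wrapper. Two readings here are assumptions, not documented RALPH behavior: "3 retries" is taken to mean one initial attempt plus up to three retries, and "never spin on the same error twice" is taken to mean escalating early when an attempt fails with an error already seen. The names `Escalation` and `run_with_ralph` are hypothetical.

```python
from typing import Callable

MAX_RETRIES = 3  # retry budget stated above


class Escalation(Exception):
    """Raised once retries are exhausted; carries diagnostics for a human."""

    def __init__(self, task: str, errors: list[str]) -> None:
        super().__init__(f"{task} failed after {len(errors)} attempt(s)")
        self.task = task
        self.errors = errors


def run_with_ralph(task: str, attempt: Callable[[int], str]) -> str:
    """Call `attempt(n)` until it succeeds, the budget is spent, or an error repeats."""
    errors: list[str] = []
    for n in range(MAX_RETRIES + 1):  # n == 0 is the initial try
        try:
            return attempt(n)
        except Exception as exc:
            message = f"{type(exc).__name__}: {exc}"
            repeated = message in errors
            errors.append(message)
            if repeated:
                break  # same error seen before: stop instead of spinning
    raise Escalation(task, errors)
```

The `errors` list attached to the exception is the diagnostic payload an operator would see on escalation.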
Router as Refinery: The Router is an orchestrator, not a worker. It delegates, chains, validates, and escalates. It never edits project files while a sub-agent is working on the same project.
Structural Comparison
Where They Converge
| Dimension | Anthropic Dynamic Workflows | Kelly Factory |
|---|---|---|
| Core idea | Script-driven orchestration of parallel subagents | Formula/TOML-driven orchestration of sequential subagents |
| Context isolation | Intermediate results in script variables, not context window | Each subagent runs in its own session, isolated context |
| Scale ambition | Up to 1,000 agents | Current pipeline ~30 steps; scales horizontally |
| Verification | "Verifies outputs before reporting back" | TEA audit + adversarial verdict + human gate |
| Delegation | Claude writes the plan, runtime executes | Router spawns subagents, never does the work |
Where They Diverge
| Dimension | Anthropic Dynamic Workflows | Kelly Factory |
|---|---|---|
| Orchestration substrate | JavaScript script (ad-hoc, generated per-task) | Formula TOML + workflow MD (persistent, version-controlled) |
| State management | Script variables (ephemeral) | Beads (Dolt-powered, git-versioned, SQL-queryable) |
| Why tracking | Implicit in script logic | Explicit reason field on every Bead — the "why" is first-class |
| Pipeline model | Parallel fleet, fan-out/fan-in | Sequential chain (parallelism within a stage is an open question) |
| Resumability | Built-in (cached results persist) | Manual (bead state preserved, but resume logic is ad-hoc) |
| Dependency awareness | Script logic manages ordering | bd dep add — explicit dependency graph with bd ready surfacing |
| Failure handling | Implicit (script can retry, user intervenes) | RALPH protocol (3 retries → escalate), Deacon patrol for stale beads |
| Human-in-the-loop | Permission prompts only | SHIP/NO-SHIP gate, operator queue, hold/continue commands |
| Repeatability | Save script → re-run slash command | Formula TOML + workflow files → re-spawn via spawn-subphase.sh |
| Audit trail | Implicit (script execution log) | Beads + DONE markers + log.md + TEA narrative |
What They Do Better
- Scale. Up to 1,000 agents per execution (16 concurrent) vs our sequential chain. For repo-wide audits, migrations, and codebase-scale refactoring, their fan-out model is dramatically faster.
- Resumability. Their workflows persist intermediate state natively. If a workflow is interrupted, it resumes from cached results. Our bead state persists, but resume logic requires manual intervention or Router detection.
- Simplicity of activation. Say "workflow" or /effort ultracode and Claude decides what to do. Our pipeline requires explicit formula TOML, workflow MD files, and spawn-subphase.sh invocations.
- Parallelism within execution. Their 16-concurrent-agent model means a single workflow can do in minutes what takes us hours of sequential subagent chaining.
What We Do Better
- Why-first design. Every Bead has a reason field. Anthropic's scripts capture intent implicitly in code logic; we capture it explicitly as data. This is the Gas Town insight: git stores What/Where/Who/How; Beads store Why. Our reason field is queryable, auditable, and survives context window resets.
- Dependency graph. bd pour instantiates molecules with explicit dependency wiring. bd ready surfaces only unblocked steps. Their script logic handles ordering, but it's procedural, not declarative. You can't query "what's blocking step 7?" without reading the script.
- Failure handling depth. RALPH (3 retries → escalate), Deacon patrol (stuck bead detection), RPH (timeout enforcement), operator queue (human escalation). Their failure mode is "user intervenes." Our failure mode is a multi-layered automated recovery system.
- Audit trail. Beads + DONE markers + log.md + TEA narrative. Every decision, every spawn, every gate result is recorded with intent. Their execution is opaque once the script runs.
- Human gates. SHIP/NO-SHIP, operator queue, hold/continue. They have permission prompts; we have structured human decision points embedded in the pipeline.
- Separation of concerns. Router orchestrates, subagents execute, beads track, gates validate. Each mechanism has a single responsibility. Their script is a monolith: orchestration, execution, and verification all live in one JS file.
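The "what's blocking step 7?" question from the dependency-graph point can be answered with a single query once dependencies are stored as rows. The sketch below uses Python's built-in sqlite3 as a stand-in for Dolt, and the `beads` and `deps` table layouts are assumptions for illustration, not the actual Beads schema.

```python
import sqlite3


def open_store() -> sqlite3.Connection:
    """Create an in-memory store with a hypothetical two-table layout."""
    db = sqlite3.connect(":memory:")
    db.executescript(
        """
        CREATE TABLE beads (
            id     TEXT PRIMARY KEY,
            reason TEXT NOT NULL,              -- the "why" as queryable data
            closed INTEGER NOT NULL DEFAULT 0
        );
        CREATE TABLE deps (
            bead_id    TEXT NOT NULL,          -- the blocked bead
            blocked_by TEXT NOT NULL           -- the bead it waits on
        );
        """
    )
    return db


def blockers(db: sqlite3.Connection, bead_id: str) -> list[tuple[str, str]]:
    """Return (id, reason) for every still-open bead that blocks `bead_id`."""
    return db.execute(
        """
        SELECT b.id, b.reason
        FROM deps AS d
        JOIN beads AS b ON b.id = d.blocked_by
        WHERE d.bead_id = ? AND b.closed = 0
        ORDER BY b.id
        """,
        (bead_id,),
    ).fetchall()
```

Because the answer includes each blocker's `reason`, the same query reports both what is blocking a step and why that blocker exists.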
What We Should Borrow
- Native resumability. Their cached-results-and-resume pattern is compelling. When a subagent dies, we currently detect it via Deacon patrol and re-spawn from scratch. We could persist intermediate results in beads and resume from the last checkpoint rather than restarting.
- Parallel fan-out within stages. Our pipeline is sequential between stages. But within a stage, we could spawn multiple subagents in parallel (e.g., "check all 50 endpoints for auth", where each endpoint is an independent task). PIP-68's approach-A pattern (sessions_yield on completion) already supports this; we just need to spawn multiple subagents before yielding.
- Simpler activation. Our pipeline setup requires reading workflow MD files, understanding formula TOML, and knowing which pipeline to use. A "just say workflow" heuristic that auto-selects the right pipeline based on project state would reduce friction.
- Model routing per agent. Their scripts can route different agents to different models. We're locked to the session's default model. For cost optimization (cheap model for simple tasks, expensive for complex ones), per-agent model selection would help.
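The resume-from-checkpoint idea in the first item can be sketched as follows. This is one possible shape, not an existing Factory mechanism: a JSON file stands in for results persisted in beads, and `run_with_checkpoints` and its parameters are hypothetical names.

```python
import json
from pathlib import Path
from typing import Callable


def run_with_checkpoints(
    items: list[str],
    work: Callable[[str], str],
    checkpoint: Path,
) -> dict[str, str]:
    """Process items in order, skipping any whose result is already saved."""
    done: dict[str, str] = (
        json.loads(checkpoint.read_text()) if checkpoint.exists() else {}
    )
    for item in items:
        if item in done:
            continue                      # finished in an earlier run
        done[item] = work(item)           # may raise if the worker dies
        # Write to a temp file, then rename, so a crash never leaves a
        # half-written checkpoint behind.
        tmp = checkpoint.with_suffix(".tmp")
        tmp.write_text(json.dumps(done))
        tmp.replace(checkpoint)
    return done
```

If `work` raises partway through, a second call with the same checkpoint path redoes only the items that never completed.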
Key Gaps We Implicitly Solve
Anthropic's dynamic workflows leave several problems unsolved that our pattern addresses:
- No persistent state substrate. Script variables are ephemeral. Beads are git-versioned and SQL-queryable. Our state survives context window resets, model changes, and session restarts.
- No intent capture. Scripts capture what the agent should do, not why. Our reason field is the atomic unit of intent tracking. Over time, this becomes an invaluable audit trail for understanding past decisions.
- No dependency-aware scheduling. Scripts handle ordering procedurally. Our bd ready + dependency graph is declarative: add a dependency, and the scheduler automatically adjusts. No code changes needed.
- No structured failure recovery. RALPH + Deacon + RPH is a complete failure-handling stack. Their model is "user intervenes or the script retries." For unattended operation, our approach is more robust.
- No human decision gates. Permission prompts are binary (approve/deny). SHIP/NO-SHIP gates are structured decision points with context, alternatives, and consequences. For high-stakes work, our gates provide more granular control.
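The difference between a binary prompt and a structured gate can be made concrete with a small data model. The types below are purely illustrative and not the Factory's gate format: `GateRequest`, `GateDecision`, and `decide` are hypothetical names built around the three elements named above (context, alternatives, consequences).

```python
from dataclasses import dataclass
from enum import Enum


class Verdict(Enum):
    SHIP = "SHIP"
    NO_SHIP = "NO-SHIP"


@dataclass(frozen=True)
class GateRequest:
    """What the human sees before deciding."""
    context: str
    alternatives: tuple[str, ...]
    consequences: dict[str, str]   # verdict value -> what happens next


@dataclass(frozen=True)
class GateDecision:
    """What gets recorded after the human decides."""
    verdict: Verdict
    decided_by: str
    reason: str


def decide(
    request: GateRequest, verdict: Verdict, decided_by: str, reason: str
) -> GateDecision:
    """Record a verdict, refusing decisions that lack a reason or a known outcome."""
    if not reason.strip():
        raise ValueError("a gate decision needs a recorded reason")
    if verdict.value not in request.consequences:
        raise ValueError(f"no consequence documented for {verdict.value}")
    return GateDecision(verdict, decided_by, reason)
```

Requiring a reason at decision time keeps the gate consistent with the why-first design: the verdict and its justification are stored together.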
Synthesis
The two patterns solve the same fundamental problem, coordinating multiple AI agents to accomplish complex tasks, from opposite directions:
- Anthropic starts with the agent and builds orchestration around it. The agent generates the plan, the runtime executes it. This is powerful for ad-hoc, one-off workflows where the agent knows best.
- We start with the pipeline and build agents around it. The formula TOML defines the structure, the Router executes it. This is powerful for repeatable, auditable workflows where structure matters more than flexibility.
The ideal hybrid would combine Anthropic's parallel fan-out and resumability with our dependency graph, intent capture, and failure handling. The result: a pipeline that can fan out to hundreds of parallel agents, persist intermediate state, resume from checkpoints, track intent on every step, and automatically recover from failures — all with structured human gates at critical decision points.
Bibliography
- Anthropic (May 28, 2026). Introducing dynamic workflows in Claude Code [Blog post]. https://www.anthropic.com
- Anthropic (May 28, 2026). Claude Opus 4.8 [Announcement]. https://www.anthropic.com
- PIP-49 (Beads adoption) — internal KB reference
- PIP-68 (Auto-spawn chain) — internal KB reference
Related
- factory-v3 — Our Factory pipeline architecture
- full-pipeline — Full dark-factory pipeline structure
- multi-agent-pipeline — Sequential subagent orchestration
- sub-agent-parallelism — Parallel subagent fleets (what Anthropic scales to 1,000)
- subagent-spawning — spawn-subphase.sh mechanism that reads workflow frontmatter
- workflows — The workflows concept
- pipeline-with-gates — Pipeline model with quality gates
- event-driven — Push-based event-driven chain (sessions_yield)
- quality-gates — TEA, adversarial, SHIP/NO-SHIP gates
- ralph-protocol — 3 retries → escalate failure handling
- ship-or-no-ship — Human SHIP/NO-SHIP decision gate
- gas-town-daemons/gas-town-daemon-architecture — Gas Town daemon model (source of Why-first design)
- agents/kelly-router — Router as refinery, never a worker
Sources: Anthropic blog post "Introducing dynamic workflows in Claude Code" (May 28, 2026), Anthropic Opus 4.8 announcement, PIP-49 (Beads adoption), PIP-68 (Auto-spawn chain), AGENTS.md (Router section), existing KB article claude-code-dynamic-workflows.md.