Sub-Agent Parallelism

Sub-agent parallelism is the central execution mechanic of the Kelly Factory: the ability to spawn multiple AI agents simultaneously so that independent work streams complete in parallel rather than sequentially. It is the primary mechanism through which the factory reaches speeds a single agent could not match. "What takes 15 minutes sequentially takes 5 in parallel": that 3x ratio summarizes the value of parallel execution.

The pattern emerged naturally from the multi-agent architecture. Once Kelly demonstrated that sub-agents could be spawned to handle specialized work, it became clear that the same mechanism could launch several agents at once rather than one at a time. Instead of Agent B waiting for Agent A to finish, both start simultaneously and report when done. The parent agent (the Router) coordinates the work without doing it itself.

The operational model works as follows: the Router identifies work that can be decomposed into independent streams, spawns a sub-agent for each stream with full context for its specific task, waits for all agents to complete (via sessions_yield), then synthesizes the results into a coherent output. The agents are ephemeral — they run to completion and die. The Router maintains continuity across all of them.
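The spawn, wait, synthesize loop described above can be sketched in Python with asyncio. This is a minimal illustration under stated assumptions, not the factory's actual implementation: run_sub_agent and route are hypothetical names, the sleep call stands in for real agent work, and asyncio.gather plays the role that sessions_yield plays for the Router.

```python
import asyncio


async def run_sub_agent(stream: str, context: dict) -> dict:
    """Hypothetical sub-agent: handles one stream, returns its result, then exits."""
    await asyncio.sleep(0.01)  # stands in for the agent's real work
    return {
        "stream": stream,
        "summary": f"{stream}: done with {len(context)} context keys",
    }


async def route(streams: dict[str, dict]) -> str:
    """Hypothetical Router: fan out one agent per stream, wait, then synthesize."""
    # Fan out: one task per independent stream, each given only its own context.
    tasks = [
        asyncio.create_task(run_sub_agent(name, ctx))
        for name, ctx in streams.items()
    ]
    # Wait for every agent to finish. gather is an assumed stand-in for
    # sessions_yield; it returns results in the order the tasks were passed.
    results = await asyncio.gather(*tasks)
    # Fan in: merge the per-agent results into one report.
    return "\n".join(r["summary"] for r in results)


if __name__ == "__main__":
    streams = {
        "app-1": {"spec": "placeholder spec for app 1"},
        "app-2": {"spec": "placeholder spec for app 2"},
    }
    print(asyncio.run(route(streams)))
```

The Router function never performs stream work itself; it only creates tasks, waits, and merges, which mirrors the coordination-only role described above.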

The canonical example is iOS app building: spawning 4 parallel sub-agents to build 4 complete iOS apps in 15 minutes. Each agent handled one app end-to-end — SwiftUI views, data models, StoreKit 2 integration, privacy policies, metadata. Total output: 60 Swift files, 40,000+ words of documentation, all production-ready. A single agent working sequentially would have taken hours to produce the same output, if it could maintain context long enough without errors.

The empirical verdict on agent count came from comparative testing: 5 focused agents produce higher-quality artifacts than a single agent with full context. The reasoning is that context compartmentalization lets each agent go deep on its domain without the cognitive overhead of maintaining awareness of all other domains. A single agent juggling all phases has to switch contexts constantly; 5 agents each own their context completely.
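Context compartmentalization can be pictured as slicing one shared project description into per-agent briefs. The sketch below is illustrative only: AgentBrief, compartmentalize, and the domain and key names are hypothetical, and the source does not say how the five agents' domains were actually divided.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AgentBrief:
    """Hypothetical brief: the only context one sub-agent receives."""
    domain: str
    context: dict


def compartmentalize(project: dict, domains: dict[str, list[str]]) -> list[AgentBrief]:
    """Give each agent only the project keys its domain needs."""
    return [
        AgentBrief(domain, {k: project[k] for k in keys if k in project})
        for domain, keys in domains.items()
    ]


if __name__ == "__main__":
    # Illustrative project keys and domain split (assumed, not from the source).
    project = {
        "ui_spec": "screens and navigation",
        "data_schema": "entities and fields",
        "pricing": "subscription tiers",
    }
    domains = {
        "views": ["ui_spec"],
        "models": ["data_schema"],
        "purchases": ["pricing", "data_schema"],
    }
    for brief in compartmentalize(project, domains):
        print(brief.domain, sorted(brief.context))
```

Each brief carries a strict subset of the project, so no agent has to hold the other domains in its context.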

Parallelism also applies to testing at scale. When 97 critical security issues were found across 4 iOS apps, 4 parallel sub-agents were spawned, one per app, and resolved 27 vulnerabilities in 8 minutes. Run sequentially, the same work would have required roughly 32 minutes of agent time (4 agents at 8 minutes each), plus the overhead of context switching.
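The timing arithmetic can be checked with a short sketch. It assumes each of the four agents needed about 8 minutes, which is how the 32-minute sequential figure is reached; the function name compare is hypothetical. Sequential wall-clock time is the sum of the per-agent durations, while parallel wall-clock time is set by the slowest agent.

```python
def compare(durations: list[float]) -> dict:
    """Compare wall-clock minutes for sequential vs. parallel execution."""
    sequential = sum(durations)  # one agent works through every stream in turn
    parallel = max(durations)    # all agents start together; the slowest one decides
    return {
        "sequential": sequential,
        "parallel": parallel,
        "speedup": sequential / parallel,
    }


if __name__ == "__main__":
    # Four agents at 8 minutes each (assumed even split).
    print(compare([8, 8, 8, 8]))
    # {'sequential': 32, 'parallel': 8, 'speedup': 4.0}

    # Uneven streams: the speedup is capped by the slowest agent.
    print(compare([8, 5, 3, 2]))
    # {'sequential': 18, 'parallel': 8, 'speedup': 2.25}
```

The second case shows why decomposition into roughly equal streams matters: with uneven streams, three agents sit idle while the slowest one finishes.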

The key architectural requirements for effective parallelism are:

  • Stateless parent coordination: the Router doesn't do work, it coordinates.
  • Ephemeral sub-agents: each agent owns its context and dies when done.
  • sessions_yield as the yield primitive: the parent yields control while sub-agents run.
  • A state tracking mechanism, so results can be synthesized when agents report back.

Without these, parallelism degrades into chaos: agents step on each other, results get lost, and context fragments.
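The state tracking requirement can be sketched as a small record of which agents were spawned and what they reported back. This is a hypothetical design, not the factory's actual mechanism: RunState, Status, and the method names are assumptions made for illustration.

```python
from dataclasses import dataclass, field
from enum import Enum


class Status(Enum):
    RUNNING = "running"
    DONE = "done"
    FAILED = "failed"


@dataclass
class RunState:
    """Hypothetical tracker for one batch of parallel sub-agents."""
    status: dict[str, Status] = field(default_factory=dict)
    results: dict[str, str] = field(default_factory=dict)

    def spawned(self, agent_id: str) -> None:
        self.status[agent_id] = Status.RUNNING

    def reported(self, agent_id: str, result: str) -> None:
        # Reject reports from agents that were never spawned in this batch.
        if agent_id not in self.status:
            raise KeyError(f"unknown agent: {agent_id}")
        self.status[agent_id] = Status.DONE
        self.results[agent_id] = result

    def failed(self, agent_id: str) -> None:
        if agent_id not in self.status:
            raise KeyError(f"unknown agent: {agent_id}")
        self.status[agent_id] = Status.FAILED

    def pending(self) -> list[str]:
        return [a for a, s in self.status.items() if s is Status.RUNNING]

    def all_settled(self) -> bool:
        # True once no agent is still running (also true for an empty batch).
        return not self.pending()


if __name__ == "__main__":
    state = RunState()
    for agent_id in ("app-1", "app-2"):
        state.spawned(agent_id)
    state.reported("app-1", "build ok")
    print(state.pending())      # ['app-2']
    state.failed("app-2")
    print(state.all_settled())  # True
```

Keeping the record outside the sub-agents means an agent can die as soon as it reports, and a failed agent shows up as an explicit status rather than a silently missing result.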

The speed gains from parallelism are fundamental to the factory's economics. At $0.50 per completed iOS app using model selection and caching, the factory is profitable because parallel execution compresses the time dimension. A sequential process that takes 4 hours can finish in about 1 hour of parallel execution, a 75% reduction in wall-clock time for the same output. Total agent compute does not shrink by the same factor, because the work is spread across agents rather than eliminated; the per-app cost is kept low by model selection and caching.


Related concepts:

  • autonomous-builder: the identity that enables the Router to initiate parallel work
  • angry-mob: parallel testing agents that attack codebases simultaneously
  • kelly-router: the orchestrator that spawns and coordinates parallel sub-agents
  • ios-factory-pipeline: where parallelism is applied across the 10-step factory pipeline