Story-by-Story Build: Context-Bounded Implementation
Type: Operational pattern
Related: kelly-handbook-ch7-multi-agent, kelly-handbook-ch8-memory
The Problem: Context Overflow in Long BUILD Phases
When a BUILD phase has 20+ stories, handing the agent the entire sprint backlog in a single session doesn't work. The agent's context window fills up with:
- The full task description (all stories)
- Every file it reads to understand the codebase
- Every file it writes or edits
- Tool call history (exec output, file contents)
- Its own reasoning
By story 5 or 6, the agent is spending most of its context re-reading files it already read. By story 10, it starts looping — reading the same files repeatedly without producing output. By story 15, it may hallucinate or produce duplicate code.
The Solution: One Story Per Agent Session
The factory spawns one story at a time. Each story gets its own subagent session with:
- Bounded context — only the files relevant to that story
- Fresh state — no accumulated tool history from previous stories
- Clear scope — the agent knows exactly what to build
The Pattern
For each story in the sprint backlog:
1. spawn-subphase.sh <project> <step> <pipeline>
2. Task description includes ONE story + relevant context files
3. sessions_yield — wait for completion
4. Verify DONE marker + bead closure
5. Next story
Stories are sequential within a build step. The agent finishes story N before story N+1 starts.
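The loop above can be sketched in Python. This is a minimal sketch, not the factory's implementation: `spawn`, `wait`, and `verify` are hypothetical callables standing in for spawn-subphase.sh, sessions_yield, and the DONE marker plus bead check, and their signatures are assumptions.

```python
from typing import Callable, Iterable


def build_story_by_story(
    stories: Iterable[str],
    spawn: Callable[[str], str],
    wait: Callable[[str], None],
    verify: Callable[[str], bool],
) -> list[str]:
    """Run stories strictly one at a time; stop at the first failed verification.

    All three callables are hypothetical stand-ins:
      spawn(story)  -> session id   (would wrap spawn-subphase.sh)
      wait(session) -> None         (would wrap sessions_yield)
      verify(story) -> bool         (would check the DONE marker and bead closure)
    """
    completed: list[str] = []
    for story in stories:
        session = spawn(story)   # one fresh session per story
        wait(session)            # block until that session finishes
        if not verify(story):    # never start story N+1 on top of a failed story N
            break
        completed.append(story)
    return completed


# Usage with in-memory fakes; STORY-003 is made to fail verification.
log: list[str] = []
done = build_story_by_story(
    ["STORY-001", "STORY-002", "STORY-003", "STORY-004"],
    spawn=lambda s: log.append(f"spawn {s}") or f"session-{s}",
    wait=lambda sid: log.append(f"wait {sid}"),
    verify=lambda s: s != "STORY-003",
)
print(done)
```

The sketch simply stops on a failed story; retry handling is left out so the sequential ordering stays visible.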
What the Task Description Looks Like
<!-- BEAD: workspace-123 -->
## Task: Implement STORY-003 — User Authentication
**Project:** my-project
**Story:** implementation-artifacts/stories/STORY-003-auth-login.md
**Context files:**
- implementation-artifacts/stories/STORY-001-models.md (completed — reference only)
- implementation-artifacts/SPEC.md
**Requirements:**
- Implement login/logout endpoints
- Add session middleware
- Write tests for auth flow
**Output:** Code in implementation-artifacts/, tests passing
The agent gets ONE story, the minimum context needed, and a clear output expectation.
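One way to keep every spawn to a single story is to generate the task description from a structured record. The sketch below assumes a hypothetical `StoryTask` dataclass and `render_task` helper; neither name comes from the factory's tooling, and only the output layout mirrors the example above.

```python
from dataclasses import dataclass, field


@dataclass
class StoryTask:
    """Hypothetical container for the fields shown in the example above."""
    bead: str
    story_id: str
    title: str
    project: str
    story_path: str
    context_files: list[str] = field(default_factory=list)
    requirements: list[str] = field(default_factory=list)
    output: str = ""


def render_task(task: StoryTask) -> str:
    """Render exactly one story into the task-description layout."""
    lines = [
        f"<!-- BEAD: {task.bead} -->",
        # \u2014 is the dash character used in the example heading.
        f"## Task: Implement {task.story_id} \u2014 {task.title}",
        f"**Project:** {task.project}",
        f"**Story:** {task.story_path}",
        "**Context files:**",
        *[f"- {path}" for path in task.context_files],
        "**Requirements:**",
        *[f"- {req}" for req in task.requirements],
        f"**Output:** {task.output}",
    ]
    return "\n".join(lines)


task = StoryTask(
    bead="workspace-123",
    story_id="STORY-003",
    title="User Authentication",
    project="my-project",
    story_path="implementation-artifacts/stories/STORY-003-auth-login.md",
    context_files=["implementation-artifacts/SPEC.md"],
    requirements=["Implement login/logout endpoints"],
    output="Code in implementation-artifacts/, tests passing",
)
text = render_task(task)
print(text)
```

Because the record holds a single `story_path`, a multi-story task description cannot be built by accident.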
The Anti-Pattern: Multi-Story Spawns
Spawning multiple stories in one session is the most common cause of build failures in the factory.
What Happens
- Agent receives 5 stories in one task description
- Reads all relevant files for story 1 — context used: 20%
- Implements story 1 — context used: 35%
- Reads files for story 2 — context used: 50%
- Implements story 2 — context used: 65%
- By story 3, context is tight — agent re-reads files it already loaded
- By story 4, agent is looping, reading models.py for the third time
- Story 5 never gets written
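The percentages in the list above follow a simple accumulation: about 20% for the first read, then roughly 15% for every later read or implement step. The sketch below extrapolates that trend. The 20 and 15 figures are read off the list, while linear growth beyond story 2 is an assumption made for illustration. A real agent does not literally pass 100%; it degrades as it approaches the limit.

```python
def context_after_each_step(
    stories: int, first_read: int = 20, step: int = 15
) -> list[tuple[str, int]]:
    """Return (step label, cumulative context %) under a linear-growth assumption."""
    usage = 0
    timeline = []
    for n in range(1, stories + 1):
        # The first read also carries the task description, so it costs more.
        usage += first_read if n == 1 else step
        timeline.append((f"read story {n}", usage))
        usage += step
        timeline.append((f"implement story {n}", usage))
    return timeline


timeline = context_after_each_step(5)
for label, pct in timeline:
    print(f"{label}: {pct}%")

# First step at which the projected usage no longer fits in the window.
overflow = next(label for label, pct in timeline if pct > 100)
print("projected overflow at:", overflow)
```

Under these assumptions the projection runs out of room while reading for story 4, which matches the point where the looping starts in the list.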
Real Example: Sally's Multi-Storyboard Failure
Sally (the UI/UX agent) was given a multi-storyboard spawn — implement 9 storyboard frames in one session. The result:
- Frames 1-3: implemented correctly
- Frames 4-6: implemented with copy-paste errors from frames 1-3
- Frames 7-9: never produced; Sally looped reading the spec file
The fix: spawn Sally once per storyboard frame. All 9 frames completed successfully in separate sessions.
Real Example: Amelia's Success with Separate Sessions
Amelia (the build agent) completed 24+ stories across test-web-run and factory-dashboard-rebuild — all in separate sessions. Each session produced clean, working code. No loops, no context overflow, no duplicate work.
Context Window Budget
The practical limit depends on the model, but the factory's rule of thumb:
| Stories per Session | Outcome |
|---|---|
| 1 | Reliable — agent finishes with context to spare |
| 2-3 | Usually works — but watch for re-reads |
| 4-5 | Risky — agent may loop on later stories |
| 6+ | Almost certain to fail — context overflow |
For complex stories (many files, large codebase), even 2 stories may be too many. When in doubt, use 1.
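The rule of thumb in the table can be encoded as a small lookup, for example to warn before a spawn. This is a sketch only: `session_risk` is a hypothetical helper and the returned labels paraphrase the table's Outcome column.

```python
def session_risk(stories_per_session: int) -> str:
    """Map a batch size to the outcome tier from the rule-of-thumb table."""
    if stories_per_session < 1:
        raise ValueError("a session needs at least one story")
    if stories_per_session == 1:
        return "reliable"
    if stories_per_session <= 3:
        return "usually works"
    if stories_per_session <= 5:
        return "risky"
    return "almost certain to fail"


for n in (1, 3, 5, 9):
    print(n, session_risk(n))
```

The lookup ignores story complexity, so for large codebases the safer reading is still one story per session.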
Implementation Details
Step Ordering
Stories are assigned to steps in the pipeline formula. By default each build step maps to one story; a step carries more than one only under the exceptions in "When to Break This Rule":
4.1 → Sprint Planning (assigns stories to steps)
4.2 → STORY-001 implementation
4.3 → STORY-002 implementation
4.4 → STORY-003 implementation
The auto-spawn-chain handles chaining: when 4.2 completes, 4.3 spawns; when 4.3 completes, 4.4 spawns; when 4.4 completes, the chain is done.
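The chaining behaviour can be illustrated with a few lines of Python. This is a sketch under assumptions: `PIPELINE` copies the step ids from the example above, and `next_step` is a hypothetical helper, not the auto-spawn-chain's actual code.

```python
from typing import Optional

# Step ids and assignments copied from the example above.
PIPELINE = {
    "4.1": "Sprint Planning",
    "4.2": "STORY-001",
    "4.3": "STORY-002",
    "4.4": "STORY-003",
}


def next_step(completed: str, order: list[str]) -> Optional[str]:
    """Return the step to spawn once `completed` finishes, or None at the end."""
    i = order.index(completed)
    return order[i + 1] if i + 1 < len(order) else None


order = list(PIPELINE)  # dicts preserve insertion order
step: Optional[str] = "4.2"
spawned = []
while step is not None:
    spawned.append(step)           # this step runs in its own session
    step = next_step(step, order)  # the chain picks the following step
print(spawned)
```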
Context Files
Each story's task description includes only the context files it needs:
- The story file itself
- The SPEC (if it exists)
- Previously completed stories that this story depends on
- Relevant design artifacts
The agent does not get the full backlog, other stories' files, or unrelated artifacts.
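The selection rules above can be written as a small filter. In this sketch `select_context_files` and its parameters are hypothetical, and the STORY-002 path in the usage example is invented purely to show that an unfinished dependency is left out.

```python
from typing import Optional


def select_context_files(
    story_file: str,
    dependencies: list[str],
    completed: set[str],
    spec_file: Optional[str] = None,
    design_artifacts: Optional[list[str]] = None,
) -> list[str]:
    """Build the bounded context list for one story."""
    files = [story_file]
    if spec_file is not None:  # the SPEC is included only if it exists
        files.append(spec_file)
    # Only dependencies that are already completed are useful as reference.
    files.extend(dep for dep in dependencies if dep in completed)
    files.extend(design_artifacts or [])
    return files


files = select_context_files(
    story_file="implementation-artifacts/stories/STORY-003-auth-login.md",
    dependencies=[
        "implementation-artifacts/stories/STORY-001-models.md",
        "implementation-artifacts/stories/STORY-002-example.md",  # hypothetical path
    ],
    completed={"implementation-artifacts/stories/STORY-001-models.md"},
    spec_file="implementation-artifacts/SPEC.md",
)
print(files)
```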
Verification
After each story:
1. DONE marker exists at the expected path
2. Bead is closed
3. Code compiles / tests pass (if QA step follows)
If verification fails, RALPH handles the retry — but only for that one story, not the entire backlog.
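The first two checks can be sketched as a function that reports what is missing for a single story. The marker file name and the `bead_is_closed` callable are assumptions for illustration; the factory's real marker path and bead lookup are not specified here.

```python
import tempfile
from pathlib import Path
from typing import Callable


def verify_story(done_marker: Path, bead_is_closed: Callable[[], bool]) -> list[str]:
    """Return a list of failures for one story; an empty list means it passed."""
    failures = []
    if not done_marker.is_file():
        failures.append(f"missing DONE marker: {done_marker}")
    if not bead_is_closed():
        failures.append("bead is still open")
    return failures


with tempfile.TemporaryDirectory() as tmp:
    marker = Path(tmp) / "STORY-003.DONE"  # hypothetical marker name
    before = verify_story(marker, bead_is_closed=lambda: False)
    marker.touch()  # simulate the agent writing its DONE marker
    after = verify_story(marker, bead_is_closed=lambda: True)

print(before)
print(after)
```

Returning the failures per story keeps a retry scoped to that story rather than the whole backlog.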
When to Break This Rule
One story per session is the default. Exceptions:
- Trivial stories — if two stories are 5-line changes with no overlap, they can share a session. But this is rare.
- Story dependencies — if story B depends on story A's exact implementation, running them in the same session ensures consistency. But the better fix is to make story B's task description include story A's output as context.
- Token budget pressure — if running many small stories and token cost is a concern, batching 2-3 may save money. But watch for the looping signs.
The Core Lesson
Context windows are finite. The factory's job is to bound the work to match. One story per session is the simplest bound — and the most reliable.
The pattern scales: 20 stories = 20 sessions = 20 fresh context windows. Each one reliable. The auto-spawn-chain makes this hands-free.
This pattern was validated in April 2026 when Sally's multi-storyboard spawn failed and Amelia's single-story spawns succeeded across 24+ stories. The lesson was immediate and clear: bound the work, bound the context.
Related
- kelly-handbook-ch7-multi-agent — context management
- kelly-handbook-ch8-memory — memory and context windows
Concept Cross-References
- memory/context-window — Context window overflow mechanics and bounds
- multi-agent-pipeline — Multi-agent orchestration that this pattern enables
- sub-agent-parallelism — Sub-agent parallelism for batched story execution
- agents/kelly-router — Router spawns one-story-per-session chains
- memory/five-layer-memory — Memory layers that survive context resets between stories
Concept Links
- world-model — story-by-story builds maintain bounded context windows, which is the memory management equivalent of world-model: keeping the cognitive substrate small and demand-loaded
- lobster-pipelines — the sequential one-story-per-session pattern mirrors lobster-pipelines: each story is a checkpointed, verifiable unit that feeds into the next step
Based on operational experience building upwork-online-school (24+ stories), upwork-rv-chatbot, and camping-checklist pipelines. Pattern validated across Amelia (build agent), Sally (design agent), and Carson (research agent).