Story-by-Story Build: Context-Bounded Implementation

Type: Operational pattern
Related: kelly-handbook-ch7-multi-agent, kelly-handbook-ch8-memory

The Problem: Context Overflow in Long BUILD Phases

When a BUILD phase has 20+ stories, handing the agent the entire sprint backlog in a single session doesn't work. The agent's context window fills up with:

- The full task description (all stories)
- Every file it reads to understand the codebase
- Every file it writes or edits
- Tool call history (exec output, file contents)
- Its own reasoning

By story 5 or 6, the agent is spending most of its context re-reading files it already read. By story 10, it starts looping — reading the same files repeatedly without producing output. By story 15, it may hallucinate or produce duplicate code.

The Solution: One Story Per Agent Session

The factory spawns one story at a time. Each story gets its own subagent session with:

- Bounded context: only the files relevant to that story
- Fresh state: no accumulated tool history from previous stories
- Clear scope: the agent knows exactly what to build

The Pattern

For each story in the sprint backlog:
    1. spawn-subphase.sh <project> <step> <pipeline>
    2. Task description includes ONE story + relevant context files
    3. sessions_yield — wait for completion
    4. Verify DONE marker + bead closure
    5. Next story

Stories are sequential within a build step. The agent finishes story N before story N+1 starts.
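The loop above can be sketched in Python. This is a minimal sketch under assumptions, not the factory's implementation: `spawn`, `wait`, and `verify` are hypothetical callables standing in for spawn-subphase.sh, sessions_yield, and the DONE marker and bead check, whose real interfaces are not shown here. Stopping at the first unverified story is also a choice made for the sketch.

```python
from dataclasses import dataclass
from typing import Callable, Iterable, List


@dataclass(frozen=True)
class Story:
    story_id: str  # e.g. "STORY-003"
    step: str      # pipeline step the story is assigned to, e.g. "4.4"


def run_backlog(
    stories: Iterable[Story],
    spawn: Callable[[Story], str],    # hypothetical: starts one session, returns a session id
    wait: Callable[[str], None],      # hypothetical: blocks until that session completes
    verify: Callable[[Story], bool],  # hypothetical: DONE marker present and bead closed
) -> List[str]:
    """Run stories strictly in order, one fresh session per story.

    Stops at the first story that fails verification, so story N+1
    never starts on top of an unverified story N.
    """
    completed: List[str] = []
    for story in stories:
        session_id = spawn(story)
        wait(session_id)
        if not verify(story):
            break
        completed.append(story.story_id)
    return completed
```

Passing the three operations in as callables keeps the ordering logic testable without a live agent session.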

What the Task Description Looks Like

<!-- BEAD: workspace-123 -->

## Task: Implement STORY-003 — User Authentication

**Project:** my-project
**Story:** implementation-artifacts/stories/STORY-003-auth-login.md
**Context files:**
- implementation-artifacts/stories/STORY-001-models.md (completed — reference only)
- implementation-artifacts/SPEC.md

**Requirements:**
- Implement login/logout endpoints
- Add session middleware
- Write tests for auth flow

**Output:** Code in implementation-artifacts/, tests passing

The agent gets ONE story, the minimum context needed, and a clear output expectation.
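A task description in this shape could be generated rather than hand-written. The following is a rough sketch, assuming the field layout shown in the example above; the function name `render_task` and its parameters are illustrative and not part of the factory's tooling. Taking a single `story_path` (rather than a list) makes the one-story rule structural.

```python
from typing import Sequence


def render_task(
    bead_id: str,
    task_title: str,
    project: str,
    story_path: str,                 # exactly one story file per task
    context_files: Sequence[str],
    requirements: Sequence[str],
    output: str,
) -> str:
    """Build a task description for exactly one story."""
    lines = [
        f"<!-- BEAD: {bead_id} -->",
        "",
        f"## Task: {task_title}",
        "",
        f"**Project:** {project}",
        f"**Story:** {story_path}",
        "**Context files:**",
        *[f"- {path}" for path in context_files],
        "",
        "**Requirements:**",
        *[f"- {req}" for req in requirements],
        "",
        f"**Output:** {output}",
    ]
    return "\n".join(lines)
```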

The Anti-Pattern: Multi-Story Spawns

Spawning multiple stories in one session is the most common cause of build failures in the factory.

What Happens

1. Agent receives 5 stories in one task description
2. Reads all relevant files for story 1 (context used: 20%)
3. Implements story 1 (context used: 35%)
4. Reads files for story 2 (context used: 50%)
5. Implements story 2 (context used: 65%)
6. By story 3, context is tight: the agent re-reads files it already loaded
7. By story 4, the agent is looping, reading models.py for the third time
8. Story 5 never gets written
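The percentages in this walkthrough can be extrapolated with simple arithmetic. The sketch below assumes linear growth using the increments implied by the steps: 20% for the first read, 15% for each later read, and 15% per implementation. Real sessions will not be this regular, and the constant names are illustrative.

```python
from typing import List

FIRST_READ = 20  # % of context used reading files for story 1
LATER_READ = 15  # % per later story's reads (35% to 50% in the walkthrough)
IMPLEMENT = 15   # % per implementation (20% to 35%, 50% to 65%)


def projected_usage(stories: int) -> List[int]:
    """Cumulative context use (%) after each story, assuming linear growth."""
    usage: List[int] = []
    total = 0
    for n in range(stories):
        total += (FIRST_READ if n == 0 else LATER_READ) + IMPLEMENT
        usage.append(total)
    return usage
```

Under these assumptions `projected_usage(5)` returns `[35, 65, 95, 125, 155]`: story 3 would end at 95% and story 4 would need 125%, which matches context being tight at story 3 and the agent looping by story 4.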

Real Example: Sally's Multi-Storyboard Failure

Sally (the UI/UX agent) was given a multi-storyboard spawn — implement 9 storyboard frames in one session. The result:

- Frames 1-3: implemented correctly
- Frames 4-6: implemented with copy-paste errors from frames 1-3
- Frames 7-9: never produced. Sally looped reading the spec file

The fix: spawn Sally once per storyboard frame. All 9 frames completed successfully in separate sessions.

Real Example: Amelia's Success with Separate Sessions

Amelia (the build agent) completed 24+ stories across test-web-run and factory-dashboard-rebuild — all in separate sessions. Each session produced clean, working code. No loops, no context overflow, no duplicate work.

Context Window Budget

The practical limit depends on the model, but the factory's rule of thumb:

| Stories per Session | Outcome |
|---|---|
| 1 | Reliable: agent finishes with context to spare |
| 2-3 | Usually works, but watch for re-reads |
| 4-5 | Risky: agent may loop on later stories |
| 6+ | Almost certain to fail: context overflow |

For complex stories (many files, large codebase), even 2 stories may be too many. When in doubt, use 1.
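If an orchestrator needs to check a proposed batch size before spawning, the rule of thumb above can be encoded as a lookup. This is an illustrative sketch only; `session_risk` is a hypothetical helper, and the thresholds simply restate the table and do not account for story complexity.

```python
def session_risk(stories_per_session: int) -> str:
    """Map a batch size to the rule-of-thumb outcome from the table above."""
    if stories_per_session < 1:
        raise ValueError("a session needs at least one story")
    if stories_per_session == 1:
        return "reliable"
    if stories_per_session <= 3:
        return "usually works"
    if stories_per_session <= 5:
        return "risky"
    return "almost certain to fail"
```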

Implementation Details

Step Ordering

Stories are assigned to steps in the pipeline formula. Each build step maps to one or more stories:

4.1 → Sprint Planning (assigns stories to steps)
4.2 → STORY-001 implementation
4.3 → STORY-002 implementation
4.4 → STORY-003 implementation

The auto-spawn-chain handles chaining: when 4.2 completes, 4.3 spawns; when 4.3 completes, 4.4 spawns; when 4.4 completes, the chain is done.
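The step numbering and chaining can be illustrated as follows. This is a sketch of the idea, not the auto-spawn-chain itself: `assign_steps` and `next_step` are hypothetical names, and the sketch assumes one story per build step with build steps starting at 4.2, as in the listing above.

```python
from typing import Dict, List, Optional


def assign_steps(
    story_ids: List[str], phase: int = 4, first_build_step: int = 2
) -> Dict[str, str]:
    """Map step ids to stories, one story per step (4.2, 4.3, 4.4, ...)."""
    return {f"{phase}.{first_build_step + i}": sid for i, sid in enumerate(story_ids)}


def next_step(steps: Dict[str, str], completed: str) -> Optional[str]:
    """Return the step to spawn once `completed` finishes, or None when done."""
    order = list(steps)  # dicts preserve insertion order
    idx = order.index(completed)
    return order[idx + 1] if idx + 1 < len(order) else None
```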

Context Files

Each story's task description includes only the context files it needs:
- The story file itself
- The SPEC (if it exists)
- Previously completed stories that this story depends on
- Relevant design artifacts

The agent does not get the full backlog, other stories' files, or unrelated artifacts.
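The selection rule above might look like this in code. It is a minimal sketch under assumptions: the dependency list and the map of completed stories are hypothetical inputs, since this document does not describe how the factory records them.

```python
from typing import Dict, List, Optional, Sequence


def context_files_for(
    story_path: str,
    depends_on: Sequence[str],
    completed: Dict[str, str],         # story id -> path of its completed story file
    spec_path: Optional[str] = None,   # None when the project has no SPEC
    design_artifacts: Sequence[str] = (),
) -> List[str]:
    """Collect only the files one story needs; everything else stays out."""
    files = [story_path]
    if spec_path is not None:
        files.append(spec_path)
    # Only completed stories this story depends on, never the whole backlog.
    files += [completed[sid] for sid in depends_on if sid in completed]
    files += list(design_artifacts)
    return files
```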

Verification

After each story:
1. DONE marker exists at the expected path
2. Bead is closed
3. Code compiles and tests pass (if a QA step follows)

If verification fails, RALPH handles the retry — but only for that one story, not the entire backlog.
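The first two checks and a per-story retry can be sketched as below. This is not RALPH's implementation: `run_story` and `bead_closed` are hypothetical callables, the marker path is supplied by the caller, and the retry limit of 3 is an assumption. The compile and test check is left out because it depends on the project.

```python
from pathlib import Path
from typing import Callable


def story_verified(done_marker: Path, bead_closed: Callable[[], bool]) -> bool:
    """True when the DONE marker file exists and the bead reports closed."""
    return done_marker.is_file() and bead_closed()


def run_with_retry(
    run_story: Callable[[], None],    # hypothetical: spawns one fresh session for this story
    done_marker: Path,
    bead_closed: Callable[[], bool],  # hypothetical bead-status lookup
    max_attempts: int = 3,            # assumed retry limit, not stated in this document
) -> int:
    """Re-run a single story until it verifies; return attempts used, or 0 on failure."""
    for attempt in range(1, max_attempts + 1):
        run_story()
        if story_verified(done_marker, bead_closed):
            return attempt
    return 0
```

Because the retry wraps a single story, a failure costs one extra session rather than a re-run of the whole backlog.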

When to Break This Rule

One story per session is the default. Exceptions:

- Trivial stories: if two stories are 5-line changes with no overlap, they can share a session. But this is rare.
- Story dependencies: if story B depends on story A's exact implementation, running them in the same session ensures consistency. But the better fix is to make story B's task description include story A's output as context.
- Token budget pressure: if running many small stories and token cost is a concern, batching 2-3 may save money. But watch for the looping signs.

The Core Lesson

Context windows are finite. The factory's job is to bound the work to match. One story per session is the simplest bound — and the most reliable.

The pattern scales: 20 stories = 20 sessions = 20 fresh context windows. Each one reliable. The auto-spawn-chain makes this hands-free.

This pattern was validated in April 2026 when Sally's multi-storyboard spawn failed and Amelia's single-story spawns succeeded across 24+ stories. The lesson was immediate and clear: bound the work, bound the context.

kelly-handbook-ch7-multi-agent — context management
kelly-handbook-ch8-memory — memory and context windows

Concept Cross-References

memory/context-window — Context window overflow mechanics and bounds
multi-agent-pipeline — Multi-agent orchestration that this pattern enables
sub-agent-parallelism — Sub-agent parallelism for batched story execution
agents/kelly-router — Router spawns one-story-per-session chains
memory/five-layer-memory — Memory layers that survive context resets between stories

Concept Links

world-model — story-by-story builds maintain bounded context windows, which is the memory management equivalent of world-model: keeping the cognitive substrate small and demand-loaded
lobster-pipelines — the sequential one-story-per-session pattern mirrors lobster-pipelines: each story is a checkpointed, verifiable unit that feeds into the next step

Based on operational experience building upwork-online-school (24+ stories), upwork-rv-chatbot, and camping-checklist pipelines. Pattern validated across Amelia (build agent), Sally (design agent), and Carson (research agent).