Kelly Factory — Overview
kelly-factory-overview.md
id: kelly-factory-overview
type: article
source: kelly-factory-overview
author: router (compiled from Kelly handbook, tweets, and gap analysis)
date: 2026-04-27


Compiled by: Router (subagent)

Date: 2026-04-27

Sources: Kelly Handbook Ch7 (Multi-Agent), Kelly Handbook Ch11 (Software Factory), Kelly Tweets (Factory), Kelly vs Gas Town Gap Analysis, soul, AGENTS.md

Architecture

Pipeline Stages

The factory runs a structured six-stage pipeline:

Idea → Intake → Research (CIS Loop) → Planning (PRD + Arch + UX Design) → Implementation (Sprint Execution) → Testing (TEA Audit) → Release (Operator Decision)

| Stage | Agent | Key Artifact | Gate |
|---|---|---|---|
| Intake | Router | `intake.md` | |
| Research | research-lead | `research-summary.md` | READY / NOT-READY |
| Planning | project-lead | `prd.md`, `architecture.md`, `ux-design.md` | PASS / FAIL |
| Implementation | build agents | `implementation-summary.md` | |
| Testing | test-lead | `tea-summary.md` | PASS / PASS-WITH-FOLLOWUPS / REMEDIATE |
| Release | Router | | SHIP / NO-SHIP |
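
The gate column can be read as a small lookup from stage to the verdicts that let work move forward. The following is a minimal sketch, not the factory's actual implementation: the names `PASSING_VERDICTS` and `may_advance` are hypothetical, and it assumes that stages with no gate listed (Intake, Implementation) advance without a verdict and that PASS-WITH-FOLLOWUPS counts as passing.

```python
# Verdicts that allow a stage to hand off to the next one.
# Assumption: PASS-WITH-FOLLOWUPS is treated as a passing Testing verdict.
PASSING_VERDICTS = {
    "Research": {"READY"},
    "Planning": {"PASS"},
    "Testing": {"PASS", "PASS-WITH-FOLLOWUPS"},
    "Release": {"SHIP"},
}


def may_advance(stage, verdict=None):
    """Return True if work may leave `stage` given the gate verdict."""
    allowed = PASSING_VERDICTS.get(stage)
    if allowed is None:
        # Assumption: a stage with no gate in the table never blocks.
        return True
    return verdict in allowed


print(may_advance("Testing", "PASS-WITH-FOLLOWUPS"))
print(may_advance("Research", "NOT-READY"))
```

Keeping the passing verdicts in one table means a new gate value only needs to be added in a single place.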

Quick Path: Bug fixes skip Research and go directly to Planning/Implementation. New products run the full pipeline.
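
The Quick Path rule amounts to choosing a stage list per request type. A minimal sketch, assuming a hypothetical `request_kind` label and assuming bug fixes still pass through Intake (the text only says Research is skipped):

```python
FULL_PIPELINE = [
    "Intake", "Research", "Planning",
    "Implementation", "Testing", "Release",
]


def stages_for(request_kind):
    """Return the ordered stages for a request.

    `request_kind` is a hypothetical label; "bug_fix" takes the Quick Path.
    """
    if request_kind == "bug_fix":
        # Quick Path: drop Research, keep everything else in order.
        return [stage for stage in FULL_PIPELINE if stage != "Research"]
    return list(FULL_PIPELINE)


print(stages_for("bug_fix"))
print(stages_for("new_product"))
```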

Full Pipeline example (CSV Export Feature): 10:15 AM intake → 10:21 AM planning complete → 11:15 AM implementation done → 11:30 AM TEA audit PASS-WITH-FOLLOWUPS → 11:31 AM release decision. ~75 minutes end-to-end.

Sub-Agent Routing Table

| Task Type | Route To |
|---|---|
| New project intake | carson |
| Research | carson, drquinn, mary, victor |
| Planning | mary |
| Architecture | winston |
| Design/UI | sally |
| Building | amelia |
| Testing/QA | qa, testlead |
| DevOps/Scaffold/Deploy | phil |
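
As an illustration only, the routing table maps naturally onto a dictionary lookup. The `ROUTES` constant and the `route` function are hypothetical names; only the task types and agent names come from the table above.

```python
# Task type -> candidate sub-agents, copied from the routing table.
ROUTES = {
    "New project intake": ["carson"],
    "Research": ["carson", "drquinn", "mary", "victor"],
    "Planning": ["mary"],
    "Architecture": ["winston"],
    "Design/UI": ["sally"],
    "Building": ["amelia"],
    "Testing/QA": ["qa", "testlead"],
    "DevOps/Scaffold/Deploy": ["phil"],
}


def route(task_type):
    """Return the sub-agents eligible for a task type."""
    try:
        return ROUTES[task_type]
    except KeyError:
        # Fail loudly instead of guessing a default agent.
        raise ValueError(f"no route defined for task type: {task_type!r}")


print(route("Testing/QA"))
```

Raising on an unknown task type, rather than falling back to a default agent, keeps misrouted work visible.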

BEADS Pipeline State

Kelly currently tracks pipeline state across four separate mechanisms: pipeline state, done markers, the TEA audit, and heartbeat.

The Kelly vs Gas Town gap analysis identifies Beads (git-versioned, SQL-queryable work primitives backed by Dolt) as the highest-priority migration target to unify these four separate mechanisms into a single substrate.

Memory Model

5-Layer System

Kelly's session persistence uses a 5-layer memory system — demand-loaded, narrative-dominant:

| Layer | File | Purpose |
|---|---|---|
| 1 | soul | Who Kelly is: identity, role, communication style |
| 2 | memory | Curated long-term memory: learnings, decisions, insights |
| 3 | memory/YYYY-MM-DD.md | Today's raw session log |
| 4 | projects/{id}/context.md | Per-project state and context |
| 5 | data/.json | Structured data, lookups |

Design principle: Narrative at the top (layers 2–3), structured at the bottom (layers 4–5). Yuki AI CEO experiments confirm this is correct: LLM recall is associative, not indexed. Tables are for lookup; narrative is for association. Through progressive disclosure, more knowledge can compound without expanding the attention footprint.

Load strategy: soul and memory load every session (small map). Project context and daily logs load on demand (large encyclopedia). This is the "map, not encyclopedia" principle — validated empirically by Yuki's CLAUDE.md shrinking 36% while repo doubled in size.
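
The load strategy can be sketched as a function that always returns the small "map" files and adds the larger ones only when asked. This is a hedged illustration: the function name `session_files` and its parameters are hypothetical, and the paths are taken as written in the layer table.

```python
# Layers 1 and 2 load every session (the "map").
ALWAYS_LOADED = ["soul", "memory"]


def session_files(project_id=None, day=None):
    """Return the relative paths to load for one session.

    `day` is a YYYY-MM-DD string; `project_id` is a project identifier.
    Both are optional, so daily logs and project context stay on demand.
    """
    files = list(ALWAYS_LOADED)
    if day is not None:
        files.append(f"memory/{day}.md")
    if project_id is not None:
        files.append(f"projects/{project_id}/context.md")
    return files


print(session_files())
print(session_files(project_id="csv-export", day="2026-04-27"))
```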

Autonomy Model

sessions_yield for Sub-Agent Execution

The Router uses sessions_yield to delegate work to sub-agents.

RALPH Retry Protocol

Retry And Learn Protocol:

  1. Any sub-agent failure → retry
  2. Same failure twice → escalate immediately (don't waste third attempt)
  3. Three failures → mandatory escalation with structured diagnostic
  4. Unrecoverable → immediate escalation with operator decision requested
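
The four rules above can be sketched as a single decision function over a sub-agent's failure history. This is an illustrative reading, not the handbook's implementation: the `Failure` and `Action` types are hypothetical, "same failure" is assumed to mean two consecutive failures with the same signature, and the unrecoverable case is checked first.

```python
from dataclasses import dataclass
from enum import Enum


class Action(Enum):
    RETRY = "retry"
    ESCALATE = "escalate"
    ESCALATE_WITH_DIAGNOSTIC = "escalate_with_diagnostic"
    ESCALATE_OPERATOR_DECISION = "escalate_operator_decision"


@dataclass(frozen=True)
class Failure:
    signature: str            # hypothetical normalized error identifier
    recoverable: bool = True


def ralph_next_action(failures):
    """Decide what to do after the most recent failure in `failures`."""
    if not failures:
        raise ValueError("ralph_next_action needs at least one failure")
    latest = failures[-1]
    if not latest.recoverable:
        # Rule 4: unrecoverable, ask the operator for a decision.
        return Action.ESCALATE_OPERATOR_DECISION
    if len(failures) >= 3:
        # Rule 3: three failures, escalate with a structured diagnostic.
        return Action.ESCALATE_WITH_DIAGNOSTIC
    if len(failures) == 2 and failures[0].signature == latest.signature:
        # Rule 2: same failure twice, skip the third attempt.
        return Action.ESCALATE
    # Rule 1: otherwise retry.
    return Action.RETRY


print(ralph_next_action([Failure("timeout"), Failure("timeout")]))
```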

Cron for Scheduling

Scheduled automations (health checks, periodic updates, daily syncs) run via cron. These are tasks that need no reasoning and run on a timer. The principle separating sessions_yield from cron is that tasks needing reasoning belong in agent sessions, while tasks that do not belong in cron or other scheduled automation. Mixing them makes both worse. Yuki AI CEO confirmed this separation with Romain's correction: "New tool doesn't mean move everything there."

heartbeat for Active Pulse

Kelly's heartbeat mechanism: agents periodically update heartbeat with their current activity and a timestamp. A stuck agent is detected by the absence of updates. This is a file-based approximation of Gas Town's Deacon daemon (which actively patrols hooks structurally).
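
A file-based heartbeat of this kind might look like the sketch below. The JSON layout, the function names, and the staleness threshold are all assumptions made for illustration; the source does not specify the heartbeat file's format.

```python
import json
import time
from pathlib import Path


def write_heartbeat(path, agent, activity, now=None):
    """Record the agent's current activity and a timestamp."""
    now = time.time() if now is None else now
    record = {"agent": agent, "activity": activity, "timestamp": now}
    Path(path).write_text(json.dumps(record))


def is_stuck(path, max_age_seconds, now=None):
    """Treat a missing or stale heartbeat as a stuck agent."""
    now = time.time() if now is None else now
    path = Path(path)
    if not path.exists():
        return True
    record = json.loads(path.read_text())
    return now - record["timestamp"] > max_age_seconds
```

A watcher would call `is_stuck` on a timer, for example `is_stuck("heartbeat.json", max_age_seconds=300)`, where the file name and the five-minute threshold are placeholders. Because detection relies on absence of updates, the threshold should sit comfortably above the agents' normal update interval.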

Gaps vs Gas Town

The Kelly vs Gas Town gap analysis (Carson, 2026-04-26) identified these missing elements:

Missing Beads Unified Substrate

Kelly has 4 separate state mechanisms (pipeline state, done markers, TEA audit, heartbeat) that Beads would unify into a single git-versioned, SQL-queryable substrate. Adoption priority: High.

Missing GUPP Hook Enforcement

Kelly's yield-friendly model has no equivalent to GUPP's absolute "if hook is non-empty, you MUST run" rule. Nothing in the architecture flags a sub-agent that defers work as violating a contract. Adoption priority: Medium-High.

Missing Autonomous Compounding Loops

Kelly's cron and TaskFlow handle scheduled tasks, but none of those tasks reads its own prior output and compounds on it. Yuki AI CEO's three production loops (New AI Models, Bug Autofix, SEO Optimizer) demonstrate the pattern: each reads its last run's output, takes the next step, and writes new output for next time. Adoption priority: Medium.
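
The read, step, write pattern described for the compounding loops can be sketched in a few lines. This is a generic illustration under stated assumptions: the state file, its JSON shape, and the `next_step` function are hypothetical and do not describe Yuki's actual loops.

```python
import json
from pathlib import Path


def run_compounding_step(state_path, step):
    """Read the last run's output, take one step, write output for next time."""
    state_path = Path(state_path)
    if state_path.exists():
        previous = json.loads(state_path.read_text())
    else:
        # First run: start from an empty state.
        previous = {"run": 0, "notes": []}
    current = step(previous)
    state_path.write_text(json.dumps(current))
    return current


def next_step(previous):
    """Hypothetical step: build on whatever the prior run recorded."""
    run = previous["run"] + 1
    return {"run": run, "notes": previous["notes"] + [f"step {run}"]}
```

Each scheduled invocation calls `run_compounding_step` once, so the timer stays in cron while the accumulated state lives in the file.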

Missing Explicit Authority Matrix

Kelly's authority is gate-driven (PASS/FAIL before advancing) but has no written per-agent authority tiers with progressive transfer tracking. Gas Town's Mayor/Crew/Polecats and Yuki's three-tier authority matrix provide the model. Adoption priority: High.

Missing 30-Day Outcome Reviews

Kelly's quality gates are point-in-time (gate at stage transition). Yuki AI CEO adds temporal quality gates: every significant decision sets a 30-day review date, then assesses actual vs expected outcome. Adoption priority: Medium.
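
A temporal gate of this kind reduces to attaching a review date to each decision and listing the ones that have come due. The sketch below is illustrative only: the `Decision` fields and the `due_for_review` helper are hypothetical names, not part of Kelly or Yuki AI CEO.

```python
from dataclasses import dataclass
from datetime import date, timedelta

REVIEW_WINDOW = timedelta(days=30)


@dataclass
class Decision:
    summary: str
    decided_on: date
    expected_outcome: str

    @property
    def review_due(self):
        """Date on which actual vs expected outcome should be assessed."""
        return self.decided_on + REVIEW_WINDOW


def due_for_review(decisions, today):
    """Return decisions whose 30-day review date has arrived."""
    return [d for d in decisions if d.review_due <= today]


decision = Decision("Adopt Beads", date(2026, 4, 27), "single state substrate")
print(decision.review_due)  # 2026-05-27
```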
