Kelly vs Gas Town — Full Gap Analysis
kelly-gas-town-gap-analysis.md
id: kelly-gas-town-gap-analysis
type: article
source: kelly-gas-town-gap-analysis
author: Carson (dark-factory-kb subagent)
date: 2026-04-26

Assessor: Carson (dark-factory-kb subagent)

Date: 2026-04-26

Sources: Kelly handbook (Ch7 multi-agent, Ch11 factory), Kelly tweets, [[steve-yegge-gas-town]], [[steve-yegge-beads]], [[steve-yegge-gupp]], [[steve-yegge-meow]], [[steve-yegge-hierarchy]], [[steve-yegge-wasteland]], [[steve-yegge-gas-city]], [[steve-yegge-saas-mountain]], [[steve-yegge-beads-kelly-gap]]

1. Beads

vs pipeline state, done markers, TEA audits, memory

Kelly Equivalent

Kelly tracks work state across four separate mechanisms: pipeline state, done markers, TEA audits, and memory.

A prior assessment ([[steve-yegge-beads-kelly-gap]]) evaluated each mechanism against Beads. The summary finding: done markers map naturally to Bead state transitions (adopt); TEA audit is the strongest semantic match (adopt with schema design); pipeline state needs a JSON view on top of Beads (partial adopt); memory requires the most careful human-interface design (partial adopt).

Gap?

Partial. Kelly has file-based equivalents for all Bead functions, but none of them are unified. Beads would replace four separate mechanisms with a single git-versioned, SQL-queryable substrate. The semantic coverage is there; the architectural integration is not.
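To make the "single SQL-queryable substrate" idea concrete, here is a minimal sketch. It rests on loud assumptions: SQLite (in memory) stands in for Dolt, and the `beads` table, its columns, the `kind` values, and the sample rows are all hypothetical illustrations, not the Beads schema or Kelly's real data.

```python
import sqlite3

# Hypothetical schema: one table holds records that Kelly currently keeps in
# four separate mechanisms. SQLite is only a stand-in for Dolt here.
conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE beads (
        id     INTEGER PRIMARY KEY,
        kind   TEXT NOT NULL,   -- 'pipeline_state', 'done_marker', 'tea_audit', 'memory'
        task   TEXT NOT NULL,
        status TEXT NOT NULL,
        reason TEXT             -- the "why" behind the record, if any
    )
    """
)

# Made-up sample rows, one per mechanism, all attached to the same task.
rows = [
    ("pipeline_state", "task-1", "implementation", None),
    ("done_marker", "task-1", "done", "planning artifacts accepted"),
    ("tea_audit", "task-1", "pass", "coverage meets the gate"),
    ("memory", "task-1", "noted", "operator prefers small releases"),
]
conn.executemany(
    "INSERT INTO beads (kind, task, status, reason) VALUES (?, ?, ?, ?)", rows
)


def history(task: str) -> list[tuple[str, str]]:
    """Return (kind, status) for every record attached to one task."""
    cur = conn.execute(
        "SELECT kind, status FROM beads WHERE task = ? ORDER BY id", (task,)
    )
    return cur.fetchall()


print(history("task-1"))
```

The point of the sketch is the single `history` query: with four file-based mechanisms, the same question needs four readers and a manual merge.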

What Gas Town Does Better

What Kelly Does Better

Adoption Potential

High. Beads are the most impactful single architectural upgrade to the Kelly pipeline. The TEA audit and done markers are the immediate adoption targets; pipeline state can follow once the Beads substrate is stable. The main risk is Dolt infrastructure — mitigate by deploying Dolt as a local instance first, not a hosted service.

3. MEOW

vs Kelly's knowledge graph / structured memory

Kelly Equivalent

Kelly has two knowledge management layers: memory files and daily logs.

These are narrative, text-based, and human-written. They are human-readable and human-writable, but not machine-queryable in any structured sense. The closest Kelly gets to structured knowledge is the TEA audit narrative and the pipeline state's machine-readable stage metadata.

Gap?

Full. Kelly has no knowledge graph. Memory and daily logs are a well-written journal; MEOW's Dolt-backed graph is a structured database. The difference is architectural: one preserves knowledge in narrative form (rich but opaque to queries), the other in typed, queryable form (structured and powerful but less narrative).

What Gas Town Does Better

What Kelly Does Better

Adoption Potential

Medium. MEOW is a powerful long-term target for the operator's Kelly factory, but it's a larger migration than Beads for state tracking. The recommendation from the prior Beads assessment holds: adopt Beads as the TEA audit substrate first (where the semantic match is strongest), then extend to memory as the interface matures. Full MEOW knowledge graph adoption requires Dolt infrastructure + a human-usable write interface + a migration strategy for existing memory files. High value, but phased investment.

5. Deacon / Boot / Witness / Refinery

vs heartbeat, subagent spawning, QA, routing

Kelly Equivalent

Kelly distributes these functions across existing mechanisms:

Kelly does not have dedicated daemon roles corresponding to Deacon, Boot, Witness, or Refinery. These functions are either handled by agents (TEA audit by test-lead) or not explicitly handled (Deacon's stuck-worker cleanup, Boot's heartbeat offload).

Gap?

Partial. Kelly has equivalents for the outcomes of these roles (liveness detection, quality gates, routing) but not for the architectural separation of these functions into dedicated daemons. The gap is in isolation, observability, and enforcement.

What Gas Town Does Better

What Kelly Does Better

Adoption Potential

Medium. The highest-value adoption from this group is the Deacon's structural liveness enforcement — replacing Kelly's file-based heartbeat with a daemon that actively patrols running agents. This is a targeted addition that doesn't require adopting the full Gas Town daemon model. Witness (continuous quality watching) is valuable but more complex to integrate. Refinery's lightweight intent decomposition could enhance Kelly's Intake/Planning stages. TEA and the 5-agent verdict already give Kelly strong quality coverage.
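A Deacon-style patrol can be approximated without the full daemon model. The sketch below is an assumption-heavy illustration: `AgentStatus`, `find_stale`, `patrol_once`, and the `max_age_s` threshold are hypothetical names, and how heartbeats are actually recorded in Kelly or Gas Town is not specified in the sources.

```python
import time
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class AgentStatus:
    name: str
    last_heartbeat: float  # Unix timestamp of the most recent heartbeat


def find_stale(agents: list[AgentStatus], now: float, max_age_s: float) -> list[str]:
    """Return the names of agents whose last heartbeat is older than max_age_s."""
    return [a.name for a in agents if now - a.last_heartbeat > max_age_s]


def patrol_once(
    agents: list[AgentStatus],
    max_age_s: float,
    on_stale: Callable[[str], None],
    now: Optional[float] = None,
) -> list[str]:
    """One patrol pass: call on_stale(name) for every stale agent."""
    now = time.time() if now is None else now
    stale = find_stale(agents, now, max_age_s)
    for name in stale:
        on_stale(name)  # e.g. re-queue the agent's work (hypothetical hook)
    return stale


agents = [AgentStatus("dev-lead", 990.0), AgentStatus("test-lead", 900.0)]
print(patrol_once(agents, max_age_s=60, on_stale=print, now=1000.0))
```

A real patrol would call `patrol_once` on a timer; the difference from a file-based heartbeat is that the patroller acts on stale agents instead of waiting for someone to read the file.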

7. Gas City SDK

vs Kelly's pipeline-as-framework

Kelly Equivalent

Kelly's factory is a structured six-stage pipeline: Intake → Research → Planning → Implementation → Testing → Release.

This is a pipeline, not an SDK. It is project-specific and opinionated — the six stages are the factory. Gas City is a toolkit for building arbitrary factories; Kelly is a specific factory implementation.

Gap?

Partial. Kelly doesn't have a Gas City equivalent — a composable SDK of building blocks that can be assembled into custom factories. However, Kelly's BMAD agent definitions ([[kelly-tweets-bmad]]) provide a degree of composability: modular agent specs that can be mixed and matched. The gap is in the pack model and the SDK extensibility.

What Gas Town Does Better

What Kelly Does Better

Adoption Potential

Medium. The operator should consider Gas City's pack composition principles, specifically the idea that agent roles, workflows, and Bead types should be composable modules rather than a fixed pipeline. This could inspire a Kelly "factory SDK" that separates the pipeline framework from project-specific configurations. However, full Gas City SDK adoption would require a significant architectural shift from Kelly's pipeline model to Gas City's pack model. The more practical near-term adoption is applying Gas City's Light Factory observability principles to make Kelly's pipeline more transparent.

9. Multi-Agent Adversarial Reliability

vs Kelly's TEA audit / QA gates

Kelly Equivalent

Kelly has two adversarial reliability mechanisms: the TEA audit and the Angry Mob (5-agent verdict).

Kelly's multi-agent review is well-established. The TEA audit is a formal three-phase gate; the Angry Mob is used for adversarial testing (e.g., 5 agents independently test the same implementation and compare results).

Gap?

None. Kelly has genuine adversarial multi-agent review. The 5-agent verdict is arguably more robust than Gas Town's single Witness: five agents that reach a verdict independently are more reliable than a single quality auditor, provided each agent is right more often than not and their errors are not strongly correlated.
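The reliability claim can be checked with a small calculation. This sketch assumes the verdict is a strict majority vote and that reviewers err independently with the same accuracy `p`; neither assumption comes from the sources, which do not say how the five verdicts are combined.

```python
from math import comb


def majority_correct(n: int, p: float) -> float:
    """Probability that a strict majority of n independent reviewers is correct."""
    need = n // 2 + 1  # smallest number of correct votes that forms a majority
    return sum(comb(n, k) * p**k * (1 - p) ** (n - k) for k in range(need, n + 1))


print(round(majority_correct(5, 0.8), 5))  # 0.94208
print(round(majority_correct(1, 0.8), 5))  # 0.8
print(round(majority_correct(5, 0.4), 5))  # 0.31744
```

With reviewers that are each right 80% of the time, five votes beat one. The last line shows the caveat: when individual accuracy drops below 50%, a majority of five is worse than a single reviewer, and correlated errors (for example, five agents sharing one blind spot) shrink the gain further.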

What Gas Town Does Better

What Kelly Does Better

Adoption Potential

Low. Kelly already has stronger adversarial coverage than Gas Town (5-agent verdict > single Witness) and a more formal quality audit structure (TEA's three phases). The only improvement Kelly should consider from Gas Town is making the Witness role continuous rather than a stage-gate — adding lightweight, continuous quality watching alongside the existing TEA stage-gate.

Concepts Kelly is Missing Entirely

The following Gas Town concepts have no Kelly equivalent at all:

GUPP (Full Execution Persistence)

Kelly's agents yield and resume. GUPP's "if your hook is non-empty, you MUST run" has no architectural equivalent. A Kelly agent that yields indefinitely will stall the pipeline; GUPP eliminates this by design. Recommendation: Add explicit timeout enforcement on sub-agent spawning with automatic re-spawn. This is a lightweight approximation of GUPP that doesn't require the full hook infrastructure.
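The recommended timeout-plus-re-spawn approximation might look like the sketch below. It assumes a sub-agent can be launched as an operating-system process; `run_with_respawn` and its parameters are hypothetical names, and this is not GUPP's hook mechanism.

```python
import subprocess
import sys
from typing import Optional


def run_with_respawn(
    cmd: list[str], timeout_s: float, max_spawns: int = 3
) -> tuple[int, Optional[int]]:
    """Run cmd; if it exceeds timeout_s, kill it and spawn it again.

    Returns (spawns_used, returncode). returncode is None if every spawn timed out.
    """
    for spawn in range(1, max_spawns + 1):
        try:
            # subprocess.run kills the child process when the timeout expires.
            done = subprocess.run(cmd, timeout=timeout_s)
            return spawn, done.returncode
        except subprocess.TimeoutExpired:
            continue  # stalled worker: fall through and re-spawn
    return max_spawns, None


if __name__ == "__main__":
    # A worker that stalls: it is killed and re-spawned until the budget runs out.
    stalled = [sys.executable, "-c", "import time; time.sleep(30)"]
    print(run_with_respawn(stalled, timeout_s=0.5, max_spawns=2))
```

A `None` return code is the signal to escalate rather than spawn again, which keeps a permanently stuck task from looping forever.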

Wasteland (Federated Reputation Economy)

Kelly's autonomous company marketplace is a design concept without a concrete protocol implementation. The Wasteland's git/Dolt-backed Wanted Board, multi-dimensional stamps, and trust ladder are the concrete realization. Recommendation: Not immediately applicable (requires network effects), but study the Wasteland's protocol design for future multi-factory scenarios.

Beads-as-Why (The Missing Why as First-Class Data)

Beads' core insight — git stores What/Where/Who/How, Beads capture Why — has no Kelly equivalent. Kelly's TEA audit captures reasoning, but as a narrative sidecar, not as the primary work primitive. Recommendation: Begin TEA schema design to map current audit fields to Bead fields. This is the highest-value missing concept to adopt.

MEOW Work Primitives (Work as First-Class System Primitive)

In MEOW, everything is Work: knowledge, coordination, communication, reputation. Kelly distinguishes between work items (pipeline tasks), memory (knowledge), and quality gates (coordination) — they are separate mechanisms. Recommendation: Adopt Beads as the universal substrate so that all Kelly mechanisms (pipeline state, memory, QA, heartbeats) can eventually be queried together.

Boot Daemon (Heartbeat Traffic Isolation)

Boot handles heartbeats so the Deacon isn't interrupted. Kelly has no equivalent heartbeat isolation — heartbeat traffic is distributed across all agents and competes with work processing. Recommendation: Consider a dedicated heartbeat handler for high-agent-count deployments.

Pack Composition Model

Gas City's packs (composable agent role + workflow + Bead type bundles) have no Kelly equivalent. Kelly's factory is a fixed six-stage pipeline. Recommendation: Investigate whether Kelly's BMAD agent definitions could be extended to a pack-like composability model for future extensibility.

Combined Architecture

What would a Kelly + Gas Town hybrid look like? The best elements from each:

Pipeline Layer: Kelly's Stages + Gas Town's Packs

The hybrid retains Kelly's explicit six-stage pipeline (Intake → Research → Planning → Implementation → Testing → Release) as the macro structure, but replaces the fixed stage implementation with Gas City's pack composition model. Each pipeline stage is a pack containing the agent roles, workflows, and Bead types specific to that stage. Stages compose via the shared Bead substrate — a Bead created in Planning flows naturally into Implementation.
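One way to picture "each stage is a pack" is the sketch below. The `Pack` fields follow the three ingredients named above (agent roles, workflows, Bead types), but the class, the `build_pipeline` function, and the example role names are hypothetical and are not Gas City's API.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Pack:
    """Hypothetical bundle of what one pipeline stage needs."""

    stage: str
    agent_roles: tuple[str, ...] = ()
    workflows: tuple[str, ...] = ()
    bead_types: tuple[str, ...] = ()


# Kelly's six stages stay fixed as the macro structure.
STAGES = ("Intake", "Research", "Planning", "Implementation", "Testing", "Release")


def build_pipeline(packs: list[Pack]) -> list[Pack]:
    """Order packs by stage and insist that each stage has exactly one pack."""
    by_stage = {p.stage: p for p in packs}
    if len(by_stage) != len(packs):
        raise ValueError("more than one pack for a stage")
    unknown = sorted(set(by_stage) - set(STAGES))
    if unknown:
        raise ValueError(f"unknown stage(s): {unknown}")
    missing = [s for s in STAGES if s not in by_stage]
    if missing:
        raise ValueError(f"no pack for stage(s): {missing}")
    return [by_stage[s] for s in STAGES]


packs = [Pack(stage=s) for s in STAGES if s != "Testing"]
packs.append(Pack("Testing", agent_roles=("test-lead",), bead_types=("tea_audit",)))
print([p.stage for p in build_pipeline(packs)])
```

Swapping the Testing pack for a different one changes that stage's roles and Bead types without touching the other five stages, which is the composability the hybrid is after.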

Execution Layer: Kelly's RALPH + Gas Town's GUPP

Sub-agents execute under GUPP's hook model (if work on hook, must run) with RALPH's retry-with-diagnostics layered on top. When a sub-agent fails, RALPH's rules apply: 3 retries, pass diagnostics, escalate on same error twice. The Deacon patrols hooks and re-queues stale Beads. This is Kelly's reliability + Gas Town's throughput in one execution model.
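The retry rules above can be sketched as a small loop. Two interpretations are assumptions: "3 retries" is read as three retries after the first attempt, and "same error twice" as the same error text on two consecutive attempts. `run_with_ralph` and `Escalation` are hypothetical names, not part of the RALPH protocol as documented.

```python
from typing import Callable, Optional


class Escalation(Exception):
    """Raised when the loop gives up and a lead agent or human must step in."""


def run_with_ralph(task: Callable[[Optional[str]], str], max_retries: int = 3) -> str:
    """Call task(diagnostics) until it succeeds.

    diagnostics is None on the first attempt and the previous error text afterwards.
    Escalates when the same error appears twice in a row or the retries run out.
    """
    previous_error: Optional[str] = None
    for _ in range(max_retries + 1):  # first attempt plus max_retries retries
        try:
            return task(previous_error)
        except Exception as exc:
            error = f"{type(exc).__name__}: {exc}"
            if error == previous_error:
                raise Escalation(f"same error twice: {error}") from exc
            previous_error = error  # passed to the next attempt as diagnostics
    raise Escalation(f"gave up after {max_retries} retries: {previous_error}")


attempts: list[Optional[str]] = []


def flaky(diagnostics: Optional[str]) -> str:
    attempts.append(diagnostics)
    if len(attempts) == 1:
        raise RuntimeError("missing fixture")
    return "ok"


print(run_with_ralph(flaky), attempts)
```

Passing the previous error into the next attempt is what distinguishes this from a blind retry: the re-run starts with the diagnosis of what just failed.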

Role Layer: Gas Town's Mayor + Kelly's Router

The Mayor is the human's primary interface — filtering all agent output, surfacing decisions, maintaining context. The Kelly Router handles pipeline orchestration (staging, gate validation, escalation). These are complementary roles: Mayor is the human's face; Router is the factory's engine. They communicate via Beads.

Memory Layer: Kelly's TEA + Gas Town's MEOW

TEA audits become Beads with MEOW's graph structure. Each TEA audit item is a Bead node with typed edges to the work item it audited and the decision it reached. The TEA's reason field IS the Bead's reason field. The full knowledge graph is queryable: "Show me all TEA decisions related to security in the last quarter." Kelly's narrative TEA richness is preserved via a notes text field on each Bead.
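The example query can be sketched against a two-table graph layout. Everything here is assumed for illustration: SQLite stands in for Dolt, the `beads` and `edges` tables and their columns are hypothetical, the rows are made up, and only the `audits` edge type is shown (an edge to the decision reached would be a second type).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE beads (
        id      TEXT PRIMARY KEY,
        kind    TEXT NOT NULL,   -- 'work_item' or 'tea_audit'
        topic   TEXT,
        reason  TEXT,            -- the structured "why"
        notes   TEXT,            -- free-form narrative carried over from the audit
        created TEXT NOT NULL    -- ISO 8601 date, so string comparison orders correctly
    );
    CREATE TABLE edges (
        src  TEXT NOT NULL,
        dst  TEXT NOT NULL,
        type TEXT NOT NULL       -- typed edge, e.g. 'audits'
    );
    """
)
conn.executemany(
    "INSERT INTO beads VALUES (?, ?, ?, ?, ?, ?)",
    [
        ("w1", "work_item", None, None, None, "2026-01-05"),
        ("a1", "tea_audit", "security", "token check missing", "narrative", "2026-02-10"),
        ("a2", "tea_audit", "performance", "query too slow", "narrative", "2026-03-01"),
        ("a3", "tea_audit", "security", "secrets in logs", "narrative", "2025-11-05"),
    ],
)
conn.executemany(
    "INSERT INTO edges (src, dst, type) VALUES (?, ?, 'audits')",
    [("a1", "w1"), ("a2", "w1"), ("a3", "w1")],
)


def tea_decisions(topic: str, since: str, until: str) -> list[tuple[str, str, str]]:
    """Return (audit_id, work_item_id, reason) for audits on a topic in a date range."""
    cur = conn.execute(
        """
        SELECT a.id, e.dst, a.reason
        FROM beads AS a
        JOIN edges AS e ON e.src = a.id AND e.type = 'audits'
        WHERE a.kind = 'tea_audit' AND a.topic = ? AND a.created BETWEEN ? AND ?
        ORDER BY a.created
        """,
        (topic, since, until),
    )
    return cur.fetchall()


print(tea_decisions("security", "2026-01-01", "2026-03-31"))
```

The `notes` column is where the narrative richness survives; the query only needs the typed fields.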

Quality Layer: Kelly's 5-Agent Verdict + Gas Town's Witness

Witness runs continuously (not just at stage gates) catching simple quality issues early. 5-agent verdict applies at TEA stage for high-stakes decisions. This gives the hybrid continuous lightweight watching plus thorough adversarial review for release-critical quality gates.

Observability Layer: Gas Town's Light Factory + Kelly's Files

Deploy Dolt as the Beads backing store. All pipeline state, TEA audits, done markers, and quality results are Beads on the same Dolt instance. Kelly's file-based interface (memory, done markers, pipeline state) is preserved as view layers generated from Dolt queries — humans read files, machines query Dolt. This is the Light Factory: all workers visible and addressable, with the file interface as the human window.
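The "files as view layers" idea can be sketched as a render step that turns query results into a human-readable file. The row shape `(task, stage, status)`, the file layout, and the function names are hypothetical; the rows would come from a Dolt query in the hybrid, but are passed in directly here.

```python
import tempfile
from pathlib import Path


def render_pipeline_view(rows: list[tuple[str, str, str]]) -> str:
    """Render (task, stage, status) rows as a small human-readable text view."""
    lines = ["Pipeline state (generated view)", ""]
    for task, stage, status in sorted(rows):
        lines.append(f"- {task}: {stage} [{status}]")
    return "\n".join(lines) + "\n"


def write_view(rows: list[tuple[str, str, str]], path: Path) -> None:
    """Regenerate the view file; the database stays the source of truth."""
    path.write_text(render_pipeline_view(rows), encoding="utf-8")


rows = [("task-2", "Testing", "running"), ("task-1", "Release", "done")]
with tempfile.TemporaryDirectory() as tmp:
    view = Path(tmp) / "pipeline-state.md"
    write_view(rows, view)
    print(view.read_text(encoding="utf-8"))
```

Because the file is regenerated rather than edited, humans keep the familiar reading interface while writes go through the shared store.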

Summary Table

| Concept | Kelly Status | Gap | Adoption Priority |
| --- | --- | --- | --- |
| Beads | Partial (4 separate mechanisms) | Partial | **High** |
| GUPP | None (yield-friendly model) | Full | **Medium-High** |
| MEOW / Knowledge Graph | None (text-based memory) | Full | **Medium** |
| Mayor (information filtering) | None (router is routing-focused) | Full | **High** |
| Crew (named persistent agents) | Partial (named lead agents exist) | Partial | **Medium** |
| Polecats (ephemeral workers) | Full (sub-agents) | None | **N/A** |
| Deacon (stuck worker cleanup) | Partial (heartbeat check) | Partial | **Medium** |
| Boot (heartbeat offload) | None | Full | **Low** |
| Witness (continuous QA) | Partial (TEA stage-gate) | Partial | **Medium** |
| Refinery (intent decomposition) | Partial (Intake → Planning) | Partial | **Medium** |
| Wasteland (federated reputation) | None (conceptual only) | Full | **Low (near-term)** |
| Gas City SDK / Packs | None (fixed pipeline) | Full | **Medium (long-term)** |
| SaaS Mountain narrative | Partial (implicit) | Partial | **N/A** |
| 5-agent adversarial verdict | Full (Angry Mob / 5-agent) | None | **N/A** |
| TEA audit (3-phase gate) | Full (TEA in pipeline) | None | **N/A** |
| Human-in-the-loop (SHIP/NO-SHIP) | Full (operator decision) | None | **N/A** |
| RALPH (retry + escalate) | Full (RALPH protocol) | None | **N/A** |
| Light Factory / Observability | Partial (file-based) | Partial | **Medium** |
| AGENTS.md as executable spec | Full (AGENTS.md pattern) | None | **N/A** |
| Structured handoff (artifact dirs) | Full (summary.md gates) | None | **N/A** |