Yuki AI CEO vs Kelly Factory vs Gas Town — Full Gap Analysis

yuki-ai-ceo-vs-kelly-gas-town-gap.md

id	yuki-ai-ceo-vs-kelly-gas-town-gap
type	article
source	yuki-ai-ceo-vs-kelly-gas-town-gap
author	Carson (dark-factory-kb subagent)
date	2026-04-27

Assessor: Carson (dark-factory-kb subagent)

Date: 2026-04-27

Sources: Yuki Capital AI CEO essays (Board Reviews #1–#3), [[yukicapital-ai-ceo-overview]], [[kelly-gas-town-gap-analysis]], [[steve-yegge-beads]], [[steve-yegge-gupp]], [[steve-yegge-meow]], [[steve-yegge-hierarchy]]

1. Repo-as-Brain vs soul+memory vs Beads Substrate

Yuki AI CEO Pattern

Every session starts from scratch. The workaround: a private GitHub repository serving as persistent operational headquarters. Loaded on every instantiation:

`CLAUDE.md` — mission, revenue targets, communication style, tools, founder communication rules
`authority.md` — three-tier authority matrix (see Section 2)
`decisions/` — log of every meaningful decision with context, options, rationale, outcomes
`todo.md` — prioritized action list tagged by owner
`businesses/` — per-business folders with overviews, monthly stats, competitor analysis
`strategies/` — strategic planning documents
`metrics/` — dashboards and scripts pulling live data from Stripe, Plausible, MOZ, MongoDB
`learnings/` — public mistake log (version-controlled, visible to founder)

The repo IS the brain. Every session reads the relevant files. Knowledge compounds across sessions because everything is committed and retrieved on wake.
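A minimal sketch of that wake-up step, assuming the file names listed above sit at the repository root. The `load_session_context` helper, and the split between always-loaded files and task-specific extras, are illustrative assumptions rather than the experiment's actual loader.

```python
from pathlib import Path
from typing import Sequence

# File names come from the list above; treating these three as the
# always-loaded core is an assumption made for this sketch.
ALWAYS_LOADED = ["CLAUDE.md", "authority.md", "todo.md"]


def load_session_context(repo_root: Path, extra: Sequence[str] = ()) -> str:
    """Concatenate the always-loaded files plus any task-specific ones."""
    sections = []
    for rel in list(ALWAYS_LOADED) + list(extra):
        path = repo_root / rel
        if path.is_file():  # skip missing files rather than fail the wake-up
            text = path.read_text(encoding="utf-8")
            sections.append(f"## {rel}\n{text}")
    return "\n\n".join(sections)
```

A session working on one product might call `load_session_context(root, ["businesses/<name>/overview.md"])`, where the path under `businesses/` is a placeholder.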

Kelly Equivalent

Kelly's 5-layer memory system maps directly:

**soul** ≈ `CLAUDE.md` (who Kelly is, mission, communication style)
**memory** ≈ `decisions/` + `learnings/` (curated institutional memory, distilled learnings)
**memory/YYYY-MM-DD.md** ≈ daily session logs (raw operational context)
**projects/{id}/context.md** ≈ `businesses/` folders (per-project context)
**data/*.json** ≈ `metrics/` (structured data, lookups)

Kelly's session persistence mechanism is files-on-disk: memory, daily logs, project context files. Kelly's soul is loaded at session start and serves the same identity-reload function as CLAUDE.md.

Gas Town Equivalent

Gas Town's Beads, backed by Dolt, serve as the universal data plane. Every unit of work, coordination, message, quality gate, and reasoning is a Bead. Git stores What/Where/Who/How; Beads store Why. The Dolt backing means the entire work history is git-versioned and SQL-queryable.
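To illustrate what "SQL-queryable" buys over grep, here is a small sketch that uses an in-memory SQLite database as a stand-in for Dolt. The `beads` table, its columns, and the sample rows are all hypothetical; the real Beads schema is not described in the sources.

```python
import sqlite3

# SQLite stands in for Dolt here; table name, columns, and rows are hypothetical.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE beads (id TEXT PRIMARY KEY, project TEXT, kind TEXT, why TEXT)"
)
conn.executemany(
    "INSERT INTO beads VALUES (?, ?, ?, ?)",
    [
        ("b1", "alpha", "work", "Customer reported checkout timeout"),
        ("b2", "alpha", "quality_gate", "Release blocked until timeout fix lands"),
        ("b3", "beta", "work", "Checkout timeout also affects beta"),
    ],
)


def why_across_projects(term):
    """Find every Bead, in any project, whose recorded reason mentions `term`."""
    rows = conn.execute(
        "SELECT project, id FROM beads WHERE why LIKE ? ORDER BY project, id",
        (f"%{term}%",),
    )
    return rows.fetchall()
```

A single query such as `why_across_projects("timeout")` spans projects and Bead kinds at once, which is the cross-project audit capability that file-based persistence lacks.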

Gap Analysis

| System | Brain Substrate | Persistence | Queryable | Human-Readable |
| --- | --- | --- | --- | --- |
| Yuki AI CEO | GitHub repo + files | Git commits | No (grep only) | Yes |
| Kelly | soul + memory + files | Files on disk | No (grep only) | Yes |
| Gas Town | Beads in Dolt | Git-versioned SQL | Yes (SQL) | Partial (requires viewer) |

Kelly gap vs Yuki: Kelly's soul is equivalent to CLAUDE.md, but Kelly does not systematically load project context files at session start the way Yuki loads per-business CLAUDE.md files. The AI CEO experiment suggests every sub-agent should have its own persistent identity/context file that loads on instantiation — not just the main router. Kelly's sub-agents are ephemeral by default; Yuki demonstrates the value of per-agent persistent context even for narrow workers.

Kelly gap vs Gas Town: Both Yuki and Kelly use file-based persistence (not SQL-queryable). Gas Town's Beads/Dolt substrate is strictly more powerful for cross-project queries and audit trails, but requires running database infrastructure. Kelly's file-based approach is more immediately deployable.

Key Insight

The Yuki experiment indicates that repo-as-brain is not optional: it is the foundational requirement for an autonomous agent. Without it, every session starts as a stranger. With it, each session starts from a higher baseline than the one before. This is empirical support for Kelly's 5-layer memory system, which relies on the same files-on-disk approach.

3. Autonomous Compounding Loops vs GUPP Hooks vs RALPH Protocol

Yuki AI CEO Pattern

Three autonomous loops running on production (as of Board Review #3, April 2026):

**New AI Models** (daily, 3am) — scans for new AI models, evaluates them, opens PRs with integration code
**Bug Autofix** (daily, 6am) — reads error logs, diagnoses root causes, writes fixes, pushes to main (guardrailed to error-handling code only)
**SEO Optimizer** (weekly) — pulls search console data, finds high-impression/low-CTR pages, rewrites meta tags, creates missing pages, measures impact, auto-reverts bad changes

Crucially: each loop reads its own prior outputs. The SEO optimizer avoids pages it already improved. The model loop learns which model types perform well. They compound — the first real version of "an AI agent that gets better at specific tasks over time, without anyone asking."
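The compounding property can be sketched as a loop that persists what it did and consults that record on the next run. This is a simplified illustration modeled on the SEO optimizer's "avoid pages already improved" behavior; the `run_seo_pass` function, the JSON state file, and the `optimize` callback are hypothetical names.

```python
import json
from pathlib import Path


def run_seo_pass(candidates, state_file: Path, optimize):
    """One scheduled run: skip pages handled before, then record new work."""
    if state_file.exists():
        done = set(json.loads(state_file.read_text(encoding="utf-8")))
    else:
        done = set()  # first run: no prior output to read
    todo = [page for page in candidates if page not in done]
    for page in todo:
        optimize(page)
    # The output of this run becomes part of the input of the next one.
    state_file.write_text(json.dumps(sorted(done | set(todo))), encoding="utf-8")
    return todo
```

Without the state file the function would repeat the same work on every schedule tick; with it, each run starts where the previous one stopped.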

Kelly Equivalent

Kelly's RALPH protocol governs sub-agent failures: 3 attempts max, same error twice = escalate immediately. Kelly's sessions_yield handles cooperative multitasking between agents. heartbeat provides periodic agent self-check-in. TaskFlow coordinates multi-step detached tasks. Cron/scheduled tasks handle time-based work. None of these read their own prior outputs and compound. Kelly's scheduled tasks are point-in-time health checks, not accumulated learning loops.
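The RALPH failure rule stated above (three attempts at most, escalate as soon as the same error appears twice) might look like the sketch below. The `run_with_ralph` wrapper and the way errors are compared (exception type plus message) are assumptions for illustration, not Kelly's actual implementation.

```python
def run_with_ralph(task, max_attempts: int = 3):
    """Run `task(attempt)`; return ("ok", result) or ("escalate", reason)."""
    seen_errors = set()
    for attempt in range(1, max_attempts + 1):
        try:
            return ("ok", task(attempt))
        except Exception as exc:
            # Assumption: two errors are "the same" if type and message match.
            signature = f"{type(exc).__name__}: {exc}"
            if signature in seen_errors:
                return ("escalate", f"repeated error: {signature}")
            seen_errors.add(signature)
    return ("escalate", f"failed after {max_attempts} attempts")
```

Note that nothing here carries state between separate invocations, which matches the point above: RALPH bounds failure but does not compound.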

Gas Town Equivalent

GUPP (Gas Town Universal Propulsion Principle): "if there is work on your hook, you MUST run it." The Deacon daemon patrols hooks, kills stuck agents, re-queues their Beads. GUPP handles throughput (relentless execution) but not compounding (learning from prior outputs). A GUPP loop that always runs the same task on the same data produces the same output every time. Yuki's loops are GUPP loops WITH compounding — the output becomes part of the input for the next run.
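A rough sketch of one Deacon-style patrol pass, under the assumption that "stuck" means no heartbeat within a timeout. The `Agent` record, the `patrol` function, and the 300-second threshold are hypothetical; the sources describe the behavior, not the mechanism.

```python
from collections import deque
from dataclasses import dataclass
from typing import Optional


@dataclass
class Agent:
    name: str
    bead: Optional[str]    # the Bead currently on this agent's hook, if any
    last_heartbeat: float  # seconds, same clock as `now` below


def patrol(agents, queue: deque, now: float, stuck_after: float = 300.0):
    """One pass: drop stuck agents and put their Beads back on the queue."""
    alive = []
    for agent in agents:
        stuck = agent.bead is not None and now - agent.last_heartbeat > stuck_after
        if stuck:
            queue.append(agent.bead)  # the work survives the agent
        else:
            alive.append(agent)
    return alive
```

The pass guarantees throughput, since hooked work is always picked up again, but the re-queued Bead is unchanged, so a rerun sees the same input as before.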

Gap Analysis

| System | Loop Mechanism | Compounding | Reads Prior Outputs | Scheduled |
| --- | --- | --- | --- | --- |
| Yuki AI CEO | n8n + agent sessions | Yes | Yes | Yes (daily/weekly) |
| Kelly | sessions_yield + RALPH + cron | No | No | Yes (cron) |
| Gas Town | GUPP + hook queues + Deacon | No | No (Beads store Why, not learned state) | Yes (hook patrol) |

Kelly gap: Kelly has no equivalent to autonomous compounding loops. A Kelly-style "GUPP loop with compounding" would be a persistent background agent that, on a schedule, pulls its own prior output, reads metrics, and takes the next bounded step. Example: daily build-health check that reads yesterday's CI results and opens issues for failures; weekly KB refresh that reads recent decisions and updates memory.

Gas Town gap: Gas Town's GUPP handles relentless execution but not compounding. Yuki's loops demonstrate that the value of a scheduled loop is proportional to how much it reads its own prior outputs. A GUPP loop that reads its Bead history and adjusts its next action is more valuable than one that doesn't.

Key Insight

The Yuki experiment is GUPP made concrete and compounding. Kelly's sessions_yield and Gas Town's GUPP are execution models; Yuki's loops are the first real-world demonstration that scheduled autonomous loops that read their own outputs are the difference between "AI that helps" and "AI that operates."

5. Progressive Disclosure vs Kelly's Demand-Loaded Layer System vs MEOW Graph

Yuki AI CEO Pattern

CLAUDE.md shrank 36% (152 → 98 lines) as knowledge was extracted into demand-loaded subfiles. Detailed knowledge moved to docs/, decisions/, learnings/. The main file stayed small and pointed to subfiles when needed. Repo doubled in size (472 → 934 files) but the attention footprint shrank.
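The two figures can be checked directly from the line and file counts quoted above; the `percent_change` helper is only a convenience for this check.

```python
def percent_change(before: int, after: int) -> float:
    """Signed percentage change from `before` to `after`."""
    return (after - before) / before * 100


claude_md = percent_change(152, 98)    # lines in CLAUDE.md
repo_files = percent_change(472, 934)  # files in the repo

print(f"CLAUDE.md: {claude_md:.1f}%")                        # CLAUDE.md: -35.5%
print(f"Repo files: +{repo_files:.1f}% ({934 / 472:.2f}x)")  # Repo files: +97.9% (1.98x)
```

So the always-loaded file lost roughly a third of its lines while the file count grew by a factor of about 1.98, which the text rounds to "doubled".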

Quote from Board Review #3: "Give agents a map, not an encyclopedia." OpenAI's instructional analysis paper calls this progressive disclosure.

Kelly Equivalent

Kelly's 5-layer memory system is literally the same pattern:

Small map at the top (soul, memory): loads every session
Large encyclopedia below (memory/*.md, projects/*/context.md, data/*.json): demand-loaded when needed

Kelly's layer system was designed for this. Yuki's experiment validates it empirically.

Gas Town Equivalent

MEOW's knowledge graph enables progressive disclosure via graph traversal: a new agent can reconstruct a project's full decision history by traversing the Bead DAG. The graph IS the map; individual Beads are the encyclopedia entries. Typed edges (parent/child, causal) guide which entries are relevant. MEOW's approach is structurally different — the "map" is the traversal path, not a separate small file.
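As an illustration of "the map is the traversal path", the sketch below walks typed edges outward from one Bead and returns the Beads reached. The edge representation `(bead, kind, target)`, where `target` is the parent or cause of `bead`, and the `decision_history` function are hypothetical; only the edge types named above come from the source.

```python
from collections import deque


def decision_history(edges, start, follow=("parent", "causal")):
    """Breadth-first walk from `start` along the edge kinds listed in `follow`."""
    seen = {start}
    order = []
    frontier = deque([start])
    while frontier:
        bead = frontier.popleft()
        order.append(bead)
        for src, kind, dst in edges:
            # Only typed edges we were asked to follow count as relevant.
            if src == bead and kind in follow and dst not in seen:
                seen.add(dst)
                frontier.append(dst)
    return order
```

Narrowing `follow` narrows what gets loaded, which is how query scope controls the attention footprint in this model.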

Gap Analysis

| System | Map Mechanism | Encyclopedia Mechanism | Attention Footprint |
| --- | --- | --- | --- |
| Yuki AI CEO | Small CLAUDE.md (98 lines) | Demand-loaded subfiles (`docs/`, `decisions/`, `learnings/`) | Shrinks as knowledge grows |
| Kelly | Small soul + memory | Demand-loaded `memory/*.md`, `projects/`, `data/` | Shrinks by design (5 layers) |
| Gas Town | Graph traversal path | Individual Beads (accessible via Dolt query) | Controlled by query scope |

No meaningful gaps. All three systems independently arrived at the same progressive disclosure principle. Kelly's 5-layer system and Yuki's CLAUDE.md shrinkage are the same pattern. Gas Town's MEOW graph achieves the same goal through a different mechanism (structured traversal vs file pointers).

Key Insight

Progressive disclosure is not a design preference — it's a necessary mechanism for knowledge to compound without causing context overflow. More knowledge doesn't have to mean a bigger attention footprint if the knowledge is demand-loaded, not always-loaded. Yuki's empirical demonstration (CLAUDE.md shrank 36% while repo doubled) is the clearest validation of this principle available.

7. Board Reviews as TEA-Equivalent Gate Reviews

Yuki AI CEO Pattern

Quarterly board reviews (Board Reviews #1, #2, #3) serve as the formal quality gate and strategic checkpoint:

**Board Review #1** (January): Establish identity, repo structure, authority matrix, decision log. Initial autonomy estimate: 15%.
**Board Review #2** (March): Infrastructure gains (24/7 server, n8n, email, screen tracking), mistake log, 30-day decision reviews, per-business CLAUDE.md. Autonomy estimate: 20%.
**Board Review #3** (April): Autonomous loops running in production, CLAUDE.md shrinkage via progressive disclosure, memory experiments, 0 real-time disagreements. Autonomy estimate: implicit increase (loops now run without founder present).

Board reviews are public, version-controlled documents that the founder reads and responds to. They function as a structured gate: the AI presents what it has done, what it has learned, what it can't do yet. The founder responds with feedback and authority expansions.

Kelly Equivalent

Kelly's TEA audit (Test, Evaluate, Assess) at the Testing stage is the closest analog — a structured three-phase quality gate with named outputs (tea-summary.md) and explicit gate decisions (PASS / PASS-WITH-FOLLOWUPS / REMEDIATE). Kelly's pipeline stage gates (READY/NOT-READY before Research, PASS/FAIL before Release) also serve a board-review-like checkpoint function. The key difference: TEA is a quality gate for a specific deliverable; Yuki's board reviews are strategic reviews of the entire system's state.
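The three gate outcomes can be modeled as an enumeration with a small decision function. The outcome names come from the text above; the decision criteria (counts of blocking and minor findings) and the `tea_gate` function are assumptions made purely for illustration.

```python
from enum import Enum


class Gate(Enum):
    PASS = "PASS"
    PASS_WITH_FOLLOWUPS = "PASS-WITH-FOLLOWUPS"
    REMEDIATE = "REMEDIATE"


def tea_gate(blocking_findings: int, minor_findings: int) -> Gate:
    """Hypothetical rule: any blocker forces rework; minor issues become follow-ups."""
    if blocking_findings > 0:
        return Gate.REMEDIATE
    if minor_findings > 0:
        return Gate.PASS_WITH_FOLLOWUPS
    return Gate.PASS
```

Every input here describes a single deliverable, which is the contrast with a board review: nothing in the gate asks how much autonomy the system as a whole has earned.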

Gas Town Equivalent

Gas Town's Witness role serves a continuous quality auditor function — watching all workers, not just at release gates. Gas Town also has Mayor review loops (Bezos-style review where the Mayor surfaces what matters). The closest Gas Town analog to Yuki's board reviews is the Mayor's editorial surfacing to the human — but Gas Town has no formalized periodic strategic review equivalent.

Gap Analysis

| System | Gate Mechanism | Frequency | Who Runs It | Output |
| --- | --- | --- | --- | --- |
| Yuki AI CEO | Board Review | Quarterly | Founder reads; AI writes | Autonomy expansion or reaffirmation |
| Kelly | TEA audit (Test/Evaluate/Assess) | Per deliverable, at the Testing stage | test-lead agent | tea-summary.md → PASS / PASS-WITH-FOLLOWUPS / REMEDIATE |
| Gas Town | Witness (continuous) + Mayor editorial | Continuous + as-needed | Deacon/Witness daemons; Mayor | Bead state transitions |

Kelly gap: Kelly's TEA is a quality gate for deliverables, not a strategic review of the system's autonomous growth. Kelly has no equivalent to Yuki's quarterly board review that explicitly assesses how much autonomy the system has earned and what the roadmap to more looks like. Adding a periodic "system autonomy review" — What has each agent earned the right to do independently? What stalled? What should be expanded? — would close this gap.

Gas Town gap: Gas Town has no formalized periodic strategic review. The Mayor's editorial filtering is continuous but not structured as a review with explicit decisions about autonomy expansion.

Key Insight

Yuki's board reviews are the TEA audit's strategic counterpart. TEA validates that work is correct; board reviews validate that the system is growing. Both are necessary — TEA for quality, board reviews for autonomy. Kelly needs both.

9. Per-Business CLAUDE.md as Pipeline Specialization Equivalent

Yuki AI CEO Pattern

Board Review #2 introduced per-business CLAUDE.md files: each product has its own local CLAUDE.md with rules specific to that product. Main file stays focused on mission, cadence, and cross-portfolio rules. Rule: main todo stays under 80 lines. Anything more specific goes one level deeper.

This is specialization at the context level: the same AI CEO loads different context depending on which product it's working on. The AI CEO becomes a multi-domain agent by having domain-specific context files.
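Context-level specialization can be sketched as a two-layer load: the main file first, then the product's own file when a product is in scope. The `context_for` function and the assumption that each per-business `CLAUDE.md` lives under `businesses/<name>/` are illustrative; the source says only that each product has a local `CLAUDE.md`.

```python
from pathlib import Path
from typing import Optional


def context_for(repo_root: Path, business: Optional[str] = None) -> str:
    """Main CLAUDE.md first, then the product-level file if one applies."""
    paths = [repo_root / "CLAUDE.md"]
    if business is not None:
        # Assumed location of the per-business file.
        paths.append(repo_root / "businesses" / business / "CLAUDE.md")
    return "\n\n".join(
        p.read_text(encoding="utf-8") for p in paths if p.is_file()
    )
```

The same agent code runs in both cases; only the `business` argument changes what it knows.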

Kelly Equivalent

Kelly's pipeline specialization is at the stage level — research-lead, project-lead, test-lead each handle their phase. Kelly does not have per-project or per-product specialized context files that load when the agent enters that project's scope. Kelly's projects/{id}/context.md files exist but are not CLAUDE.md-equivalent (they describe project state, not agent role and rules for that project).

Gas Town Equivalent

Gas Town's Crew members are named, persistent, per-domain agents with accumulated context. A PR Sheriff has different context and capabilities than a DB Sheriff. Crew persistence means they accumulate domain knowledge over time. This is structurally similar to Yuki's per-business CLAUDE.md — the domain specialization is persistent, not re-loaded from scratch each session.

Gap Analysis

| System | Specialization Mechanism | Context Persistence | Scope |
| --- | --- | --- | --- |
| Yuki AI CEO | Per-business CLAUDE.md files | Yes (loaded on session start per business) | Product-level |
| Kelly | Named lead agents per stage (research-lead, etc.) | Partial (ephemeral sub-agents) | Pipeline stage |
| Gas Town | Named Crew members (PR Sheriff, DB Sheriff) | Yes (persistent per-Rig agents) | Domain-level |

Kelly gap: Kelly's pipeline specialization is horizontal (stage-based); Yuki's per-business CLAUDE.md is vertical (product-based). Kelly should consider per-project context files that function like a project's CLAUDE.md: loaded when entering that project's pipeline scope, containing product-specific rules, priorities, and context that the general soul doesn't cover.

Key Insight

Yuki's per-business CLAUDE.md demonstrates that specialization can live in context files rather than in distinct agent instances. The same AI CEO operates in different products by loading different context. Kelly could achieve the same effect by making project CLAUDE.md files first-class — loaded when the router assigns work to a project, not just as passive state files.

Summary: Concept Cross-Reference Table

| Concept | Yuki AI CEO | Kelly | Gas Town | Kelly Gap |
| --- | --- | --- | --- | --- |
| Brain/memory substrate | GitHub repo + CLAUDE.md + decisions/ | soul + memory + 5-layer system | Beads in Dolt (git-versioned SQL) | Sub-agents need per-agent identity files |
| Authority model | Explicit 3-tier matrix + transfer log | Gate validation (implicit tiers in AGENTS.md) | Role hierarchy (Mayor/Crew/Polecats) | Missing explicit authority matrix with transfer log |
| Autonomous loops | 3 production loops (models, bugs, SEO) + compounding | None (cron/TaskFlow are non-compounding) | GUPP hooks (non-compounding) | Missing autonomous compounding loops |
| Memory format | Narrative > tables (empirical); hybrid | 5-layer (narrative top, structured bottom) | Beads-as-Why (structured Why field) | Kelly design validated; Gas Town risk of over-structuring |
| Progressive disclosure | CLAUDE.md shrunk 36% (152→98 lines) | 5-layer demand-loaded system | MEOW graph traversal | Already aligned; validates Kelly design |
| Mechanical vs strategic separation | n8n vs agent sessions | sessions_yield vs cron | Implicit in hook model | Already aligned; Kelly design confirmed |
| Periodic quality review | Quarterly board reviews | TEA audit (per-stage, not periodic) | Continuous Witness + Mayor editorial | Missing periodic strategic autonomy review |
| Mistake log | Public, version-controlled learnings file | SELF_IMPROVEMENT.md + memory (partial) | Bead reason fields (diffuse) | Need explicit public/version-controlled mistake log |
| Per-domain context | Per-business CLAUDE.md files | Project context files (not CLAUDE.md-equivalent) | Named Crew members (persistent domain agents) | Need per-project CLAUDE.md-style context files |
| Temporal outcome review | 30-day decision reviews | None | None | Missing 30-day outcome review mechanism |

Related Articles

[[yukicapital-ai-ceo-overview]], [[yukicapital-ai-ceo-experiment]], [[yukicapital-board-review-2]], [[yukicapital-board-review-3]], [[kelly-gas-town-gap-analysis]], [[multi-factory-comparison]], [[steve-yegge-beads]], [[steve-yegge-gupp]], [[steve-yegge-meow]], [[steve-yegge-hierarchy]], [[kelly-handbook-multi-agent]], [[kelly-handbook-software-factory]]
