Yuki AI CEO vs Kelly Factory vs Gas Town — Full Gap Analysis
Assessor: Carson (dark-factory-kb subagent)
Date: 2026-04-27
Sources: Yuki Capital AI CEO essays (Board Reviews #1–#3), [[yukicapital-ai-ceo-overview]], [[kelly-gas-town-gap-analysis]], [[steve-yegge-beads]], [[steve-yegge-gupp]], [[steve-yegge-meow]], [[steve-yegge-hierarchy]]
1. Repo-as-Brain vs soul+memory vs Beads Substrate
Yuki AI CEO Pattern
Every session starts from scratch. The workaround: a private GitHub repository serving as persistent operational headquarters. Loaded on every instantiation:
- `CLAUDE.md` — mission, revenue targets, communication style, tools, founder communication rules
- `authority.md` — three-tier authority matrix (see Section 2)
- `decisions/` — log of every meaningful decision with context, options, rationale, outcomes
- `todo.md` — prioritized action list tagged by owner
- `businesses/` — per-business folders with overviews, monthly stats, competitor analysis
- `strategies/` — strategic planning documents
- `metrics/` — dashboards and scripts pulling live data from Stripe, Plausible, MOZ, MongoDB
- `learnings/` — public mistake log (version-controlled, visible to founder)
The repo IS the brain. Every session reads the relevant files. Knowledge compounds across sessions because everything is committed and retrieved on wake.
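The wake-up step described above can be sketched roughly as follows. This is a minimal illustration, not Yuki's actual loader: the function name `build_session_context`, the choice of which files count as always-loaded, and the `## <file>` section headers are all assumptions made for the example.

```python
from pathlib import Path
from typing import List, Sequence

# Hypothetical always-loaded set, picked from the file list above.
ALWAYS_LOADED = ["CLAUDE.md", "authority.md", "todo.md"]


def build_session_context(repo_root: Path, extra: Sequence[str] = ()) -> str:
    """Concatenate the always-loaded files plus any task-relevant extras.

    Missing files are skipped rather than treated as errors, so a fresh
    repo still yields a (smaller) context.
    """
    sections: List[str] = []
    for rel in list(ALWAYS_LOADED) + list(extra):
        path = repo_root / rel
        if path.is_file():
            text = path.read_text(encoding="utf-8").strip()
            sections.append(f"## {rel}\n{text}")
    return "\n\n".join(sections)
```

Because the loader only reads committed files, anything a session writes back to the repo becomes part of the next session's starting context, which is the compounding effect the paragraph describes.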
Kelly Equivalent
Kelly's 5-layer memory system maps directly:
- **soul** ≈ `CLAUDE.md` (who Kelly is, mission, communication style)
- **memory** ≈ `decisions/` + `learnings/` (curated institutional memory, distilled learnings)
- **memory/YYYY-MM-DD.md** ≈ daily session logs (raw operational context)
- **projects/{id}/context.md** ≈ `businesses/` folders (per-project context)
- **data/*.json** ≈ `metrics/` (structured data, lookups)
Kelly's session persistence mechanism is files-on-disk: memory, daily logs, project context files. Kelly's soul is loaded at session start and serves the same identity-reload function as CLAUDE.md.
Gas Town Equivalent
Gas Town's Beads, backed by Dolt, serve as the universal data plane. Every unit of work, coordination, message, quality gate, and reasoning is a Bead. Git stores What/Where/Who/How; Beads store Why. The Dolt backing means the entire work history is git-versioned and SQL-queryable.
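To make "SQL-queryable Why" concrete, here is a small sketch. It assumes a hypothetical `beads` table with `id`, `project`, `kind`, and `why` columns; the real Beads schema is not described in this article. SQLite's in-memory database stands in for Dolt only so that the example runs without a server.

```python
import sqlite3
from typing import List, Tuple


def open_bead_store() -> sqlite3.Connection:
    """Create an in-memory store with a hypothetical Bead table."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE beads (id TEXT PRIMARY KEY, project TEXT, kind TEXT, why TEXT)"
    )
    return conn


def record_bead(conn: sqlite3.Connection, bead_id: str, project: str,
                kind: str, why: str) -> None:
    """Store one Bead together with the reasoning behind it."""
    conn.execute(
        "INSERT INTO beads VALUES (?, ?, ?, ?)", (bead_id, project, kind, why)
    )


def why_by_kind(conn: sqlite3.Connection, kind: str) -> List[Tuple[str, str]]:
    """Cross-project query: the recorded 'why' for every Bead of one kind."""
    rows = conn.execute(
        "SELECT project, why FROM beads WHERE kind = ? ORDER BY project, id",
        (kind,),
    )
    return rows.fetchall()
```

A file-based brain would answer the same question with grep across folders; the point of the sketch is only that a single query spans every project at once.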
Gap Analysis
| System | Brain Substrate | Persistence | Queryable | Human-Readable |
|---|---|---|---|---|
| Yuki AI CEO | GitHub repo + files | Git commits | No (grep only) | Yes |
| Kelly | soul + memory + files | Files on disk | No (grep only) | Yes |
| Gas Town | Beads in Dolt | Git-versioned SQL | Yes (SQL) | Partial (requires viewer) |
Kelly gap vs Yuki: Kelly's soul is equivalent to CLAUDE.md, but Kelly does not systematically load project context files at session start the way Yuki loads per-business CLAUDE.md files. The AI CEO experiment suggests every sub-agent should have its own persistent identity/context file that loads on instantiation — not just the main router. Kelly's sub-agents are ephemeral by default; Yuki demonstrates the value of per-agent persistent context even for narrow workers.
Kelly gap vs Gas Town: Both Yuki and Kelly use file-based persistence (not SQL-queryable). Gas Town's Beads/Dolt substrate is strictly more powerful for cross-project queries and audit trails, but requires running database infrastructure. Kelly's file-based approach is more immediately deployable.
Key Insight
The Yuki experiment is strong evidence that repo-as-brain is not optional: it is the foundational requirement for an autonomous agent. Without it, every session starts as a stranger. With it, each session starts from a higher baseline than the one before. This is empirical support for Kelly's 5-layer memory system, which rests on the same premise.
3. Autonomous Compounding Loops vs GUPP Hooks vs RALPH Protocol
Yuki AI CEO Pattern
Three autonomous loops running on production (as of Board Review #3, April 2026):
- **New AI Models** (daily, 3am) — scans for new AI models, evaluates them, opens PRs with integration code
- **Bug Autofix** (daily, 6am) — reads error logs, diagnoses root causes, writes fixes, pushes to main (guardrailed to error-handling code only)
- **SEO Optimizer** (weekly) — pulls search console data, finds high-impression/low-CTR pages, rewrites meta tags, creates missing pages, measures impact, auto-reverts bad changes
Crucially: each loop reads its own prior outputs. The SEO optimizer avoids pages it already improved. The model loop learns which model types perform well. They compound — the first real version of "an AI agent that gets better at specific tasks over time, without anyone asking."
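The compounding behaviour can be sketched as a loop whose state file is both its output and its next input. This is an illustrative sketch only: the history file layout, the `improved_pages` key, and the `max_changes` bound are assumptions, and the actual SEO work is left as a placeholder comment.

```python
import json
from pathlib import Path
from typing import Dict, List


def load_history(path: Path) -> Dict[str, List[str]]:
    """Read what earlier runs already did; an absent file means a first run."""
    if path.is_file():
        return json.loads(path.read_text(encoding="utf-8"))
    return {"improved_pages": []}


def run_once(history_path: Path, candidates: List[str],
             max_changes: int = 2) -> List[str]:
    """One bounded run: skip pages handled before, then record what was done."""
    history = load_history(history_path)
    done = set(history["improved_pages"])
    todo = [page for page in candidates if page not in done][:max_changes]
    # Placeholder: the real loop would rewrite meta tags for each page here.
    history["improved_pages"].extend(todo)
    history_path.write_text(json.dumps(history, indent=2), encoding="utf-8")
    return todo
```

Running `run_once` repeatedly with the same candidate list works through the list instead of repeating itself, which is the difference between a scheduled task and a compounding one.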
Kelly Equivalent
Kelly's RALPH protocol governs sub-agent failures: at most 3 attempts, and the same error twice triggers immediate escalation. `sessions_yield` handles cooperative multitasking between agents, `heartbeat` provides periodic agent self-check-in, TaskFlow coordinates multi-step detached tasks, and cron/scheduled tasks handle time-based work. None of these mechanisms reads its own prior outputs, so none of them compounds: Kelly's scheduled tasks are point-in-time health checks, not accumulated learning loops.
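The RALPH failure rule stated above (three attempts at most, escalate as soon as the same error appears twice) could look something like this. The function name `run_with_ralph` and the decision to compare errors by exception type plus message are assumptions for illustration; Kelly's real implementation is not shown in the sources.

```python
from typing import Callable, List, Tuple

MAX_ATTEMPTS = 3  # "3 attempts max" from the rule above


def run_with_ralph(task: Callable[[int], object]) -> Tuple[str, List[str]]:
    """Run task(attempt) under the retry rule.

    Returns ("ok", errors_seen) on success, or ("escalate", errors_seen)
    when attempts run out or the same error is seen a second time.
    """
    errors: List[str] = []
    for attempt in range(1, MAX_ATTEMPTS + 1):
        try:
            task(attempt)
            return "ok", errors
        except Exception as exc:
            # Assumption: two errors are "the same" if type and message match.
            message = f"{type(exc).__name__}: {exc}"
            repeated = message in errors
            errors.append(message)
            if repeated:
                return "escalate", errors
    return "escalate", errors
```

Note that nothing here persists between invocations, which matches the observation that RALPH bounds failure but does not learn from earlier runs.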
Gas Town Equivalent
GUPP (Gas Town Universal Propulsion Principle): "if there is work on your hook, you MUST run it." The Deacon daemon patrols hooks, kills stuck agents, re-queues their Beads. GUPP handles throughput (relentless execution) but not compounding (learning from prior outputs). A GUPP loop that always runs the same task on the same data produces the same output every time. Yuki's loops are GUPP loops WITH compounding — the output becomes part of the input for the next run.
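A rough sketch of the patrol behaviour described above: find hooks whose agent has gone quiet, treat that agent as killed, and put its Bead back on the queue. The `Hook` fields, the `patrol` function, and the `stuck_after` timeout are hypothetical names and values chosen for the example, not Gas Town's actual data model.

```python
from collections import deque
from dataclasses import dataclass
from typing import Deque, Dict, List, Optional


@dataclass
class Hook:
    agent: str
    bead_id: Optional[str]  # work currently on the hook, if any
    last_seen: float        # timestamp (seconds) of the last sign of progress


def patrol(hooks: Dict[str, Hook], queue: Deque[str], now: float,
           stuck_after: float = 300.0) -> List[str]:
    """Re-queue the Bead of every stuck agent; return the agents that were reset."""
    reset: List[str] = []
    for hook in hooks.values():
        if hook.bead_id is not None and now - hook.last_seen > stuck_after:
            queue.append(hook.bead_id)  # the work goes back for another agent
            hook.bead_id = None         # the stuck agent is considered killed
            reset.append(hook.agent)
    return reset
```

The patrol guarantees that work keeps moving, but the re-queued Bead is identical to the one that stalled, so this mechanism on its own gives throughput without compounding.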
Gap Analysis
| System | Loop Mechanism | Compounding | Reads Prior Outputs | Scheduled |
|---|---|---|---|---|
| Yuki AI CEO | n8n + agent sessions | Yes | Yes | Yes (daily/weekly) |
| Kelly | sessions_yield + RALPH + cron | No | No | Yes (cron) |
| Gas Town | GUPP + hook queues + Deacon | No | No (Beads store Why, not learned state) | Yes (hook patrol) |
Kelly gap: Kelly has no equivalent to autonomous compounding loops. A Kelly-style "GUPP loop with compounding" would be a persistent background agent that, on a schedule, pulls its own prior output, reads metrics, and takes the next bounded step. Example: daily build-health check that reads yesterday's CI results and opens issues for failures; weekly KB refresh that reads recent decisions and updates memory.
Gas Town gap: Gas Town's GUPP handles relentless execution but not compounding. Yuki's loops suggest that a scheduled loop becomes more valuable the more it reads and acts on its own prior outputs. A GUPP loop that reads its Bead history and adjusts its next action is more valuable than one that doesn't.
Key Insight
The Yuki experiment is GUPP made concrete and compounding. Kelly's sessions_yield and Gas Town's GUPP are execution models; Yuki's loops are a real-world demonstration that scheduled autonomous loops which read their own outputs mark the difference between "AI that helps" and "AI that operates."
5. Progressive Disclosure vs Kelly's Demand-Loaded Layer System vs MEOW Graph
Yuki AI CEO Pattern
CLAUDE.md shrank 36% (152 → 98 lines) as knowledge was extracted into demand-loaded subfiles. Detailed knowledge moved to docs/, decisions/, learnings/. The main file stayed small and pointed to subfiles when needed. Repo doubled in size (472 → 934 files) but the attention footprint shrank.
Quote from Board Review #3: "Give agents a map, not an encyclopedia." OpenAI's instructional analysis paper calls this progressive disclosure.
Kelly Equivalent
Kelly's 5-layer memory system is literally the same pattern:
- Small map at the top (soul, memory) — loads every session
- Large encyclopedia below (memory/*.md, projects/*/context.md, data/.json) — demand-loaded when needed
Kelly's layer system was designed for this. Yuki's experiment validates it empirically.
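One way to see why the attention footprint can shrink while the knowledge base grows is to count only what an agent reads for a single task. The sketch below assumes a hypothetical `pointers` mapping from topic to subfile and measures footprint in lines; neither is taken from Kelly or Yuki.

```python
from pathlib import Path
from typing import Dict, Iterable


def footprint(root: Path, map_file: str, pointers: Dict[str, str],
              topics: Iterable[str]) -> int:
    """Lines read for one task: the small map plus only the requested subfiles.

    Topics without a pointer are ignored; files that do not exist count as 0.
    """
    paths = [root / map_file]
    paths += [root / pointers[topic] for topic in topics if topic in pointers]
    return sum(
        len(path.read_text(encoding="utf-8").splitlines())
        for path in paths
        if path.is_file()
    )
```

Adding a new 100-line subfile changes the footprint only for tasks that ask for that topic, so the always-loaded cost stays at the size of the map.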
Gas Town Equivalent
MEOW's knowledge graph enables progressive disclosure via graph traversal: a new agent can reconstruct a project's full decision history by traversing the Bead DAG. The graph IS the map; individual Beads are the encyclopedia entries. Typed edges (parent/child, causal) guide which entries are relevant. MEOW's approach is structurally different — the "map" is the traversal path, not a separate small file.
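The traversal idea can be illustrated with a plain adjacency list. In this sketch the edge representation, the edge-type names, and the direction of edges (from a Bead toward the Beads that explain it) are assumptions; the real MEOW graph lives in Dolt and is not specified here.

```python
from typing import Dict, List, Set, Tuple

# Hypothetical edge list: bead id -> [(edge_type, target bead id), ...]
Edges = Dict[str, List[Tuple[str, str]]]


def decision_history(edges: Edges, start: str, follow: Set[str]) -> List[str]:
    """Depth-first walk from `start`, following only the edge types in `follow`.

    Returns bead ids in visit order, each at most once.
    """
    seen: List[str] = []
    stack = [start]
    while stack:
        bead = stack.pop()
        if bead in seen:
            continue
        seen.append(bead)
        # Push in reverse so edges are explored in their listed order.
        for edge_type, target in reversed(edges.get(bead, [])):
            if edge_type in follow:
                stack.append(target)
    return seen
```

The `follow` set plays the role of the map: choosing which edge types to traverse decides how much of the encyclopedia an agent ends up reading.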
Gap Analysis
| System | Map Mechanism | Encyclopedia Mechanism | Attention Footprint |
|---|---|---|---|
| Yuki AI CEO | Small CLAUDE.md (98 lines) | Demand-loaded subfiles (docs/, decisions/, learnings/) | Shrinks as knowledge grows |
| Kelly | Small soul + memory | Demand-loaded memory/*.md, projects/*, data/ | Shrinks by design (5 layers) |
| Gas Town | Graph traversal path | Individual Beads (accessible via Dolt query) | Controlled by query scope |
No meaningful gaps. All three systems independently arrived at the same progressive disclosure principle. Kelly's 5-layer system and Yuki's CLAUDE.md shrinkage are the same pattern. Gas Town's MEOW graph achieves the same goal through a different mechanism (structured traversal vs file pointers).
Key Insight
Progressive disclosure is not a design preference: it is a necessary mechanism for knowledge to compound without causing context overflow. More knowledge doesn't have to mean a bigger attention footprint if the knowledge is demand-loaded rather than always-loaded. Yuki's result (CLAUDE.md shrank 36%, from 152 to 98 lines, while the repo roughly doubled from 472 to 934 files) is direct empirical support for this principle.
7. Board Reviews as TEA-Equivalent Gate Reviews
Yuki AI CEO Pattern
Board reviews (Board Reviews #1, #2, #3) serve as the formal quality gate and strategic checkpoint. They are described as quarterly, although the three held so far are dated January, March, and April:
- **Board Review #1** (January): Establish identity, repo structure, authority matrix, decision log. Initial autonomy estimate: 15%.
- **Board Review #2** (March): Infrastructure gains (24/7 server, n8n, email, screen tracking), mistake log, 30-day decision reviews, per-business CLAUDE.md. Autonomy estimate: 20%.
- **Board Review #3** (April): Autonomous loops running in production, CLAUDE.md shrinkage via progressive disclosure, memory experiments, 0 real-time disagreements. Autonomy estimate: implicit increase (loops now run without founder present).
Board reviews are public, version-controlled documents that the founder reads and responds to. They function as a structured gate: the AI presents what it has done, what it has learned, what it can't do yet. The founder responds with feedback and authority expansions.
Kelly Equivalent
Kelly's TEA audit (Test, Evaluate, Assess) at the Testing stage is the closest analog — a structured three-phase quality gate with named outputs (tea-summary.md) and explicit gate decisions (PASS / PASS-WITH-FOLLOWUPS / REMEDIATE). Kelly's pipeline stage gates (READY/NOT-READY before Research, PASS/FAIL before Release) also serve a board-review-like checkpoint function. The key difference: TEA is a quality gate for a specific deliverable; Yuki's board reviews are strategic reviews of the entire system's state.
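The three gate outcomes named above can be modelled as a small enum. The decision rule in `tea_gate` (any blocking finding forces REMEDIATE, any non-blocking finding downgrades to PASS-WITH-FOLLOWUPS) is an assumption made for this sketch; the sources name the outcomes but not the rule that selects them.

```python
from enum import Enum
from typing import List


class Gate(Enum):
    PASS = "PASS"
    PASS_WITH_FOLLOWUPS = "PASS-WITH-FOLLOWUPS"
    REMEDIATE = "REMEDIATE"


def tea_gate(blocking: List[str], followups: List[str]) -> Gate:
    """Map audit findings to a gate decision (hypothetical rule, see lead-in)."""
    if blocking:
        return Gate.REMEDIATE
    if followups:
        return Gate.PASS_WITH_FOLLOWUPS
    return Gate.PASS
```

A board review has no equivalent function: its inputs are the system's track record rather than findings about a single deliverable, which is the difference the paragraph draws.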
Gas Town Equivalent
Gas Town's Witness role serves a continuous quality auditor function — watching all workers, not just at release gates. Gas Town also has Mayor review loops (Bezos-style review where the Mayor surfaces what matters). The closest Gas Town analog to Yuki's board reviews is the Mayor's editorial surfacing to the human — but Gas Town has no formalized periodic strategic review equivalent.
Gap Analysis
| System | Gate Mechanism | Frequency | Who Runs It | Output |
|---|---|---|---|---|
| Yuki AI CEO | Board Review | Quarterly | Founder reads; AI writes | Autonomy expansion or reaffirmation |
| Kelly | TEA audit (Test/Evaluate/Assess) | At the Testing stage (before Release) | test-lead agent | tea-summary.md → PASS / PASS-WITH-FOLLOWUPS / REMEDIATE |
| Gas Town | Witness (continuous) + Mayor editorial | Continuous + as-needed | Deacon/Witness daemons; Mayor | Bead state transitions |
Kelly gap: Kelly's TEA is a quality gate for deliverables, not a strategic review of the system's autonomous growth. Kelly has no equivalent to Yuki's quarterly board review that explicitly assesses how much autonomy the system has earned and what the roadmap to more looks like. Adding a periodic "system autonomy review" — What has each agent earned the right to do independently? What stalled? What should be expanded? — would close this gap.
Gas Town gap: Gas Town has no formalized periodic strategic review. The Mayor's editorial filtering is continuous but not structured as a review with explicit decisions about autonomy expansion.
Key Insight
Yuki's board reviews are the TEA audit's strategic counterpart. TEA validates that work is correct; board reviews validate that the system is growing. Both are necessary — TEA for quality, board reviews for autonomy. Kelly needs both.
9. Per-Business CLAUDE.md as Pipeline Specialization Equivalent
Yuki AI CEO Pattern
Board Review #2 introduced per-business CLAUDE.md files: each product has its own local CLAUDE.md with rules specific to that product. The main file stays focused on mission, cadence, and cross-portfolio rules. The rule: the main todo stays under 80 lines, and anything more specific goes one level deeper.
This is specialization at the context level: the same AI CEO loads different context depending on which product it's working on. The AI CEO becomes a multi-domain agent by having domain-specific context files.
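Context-level specialization amounts to choosing which files to load based on the product in scope. The sketch below assumes the local file lives at `businesses/<name>/CLAUDE.md`; the article confirms the `businesses/` folder and the per-business CLAUDE.md files, but that exact path and the function name `context_files` are illustrative guesses.

```python
from pathlib import Path
from typing import List, Optional


def context_files(repo_root: Path, business: Optional[str] = None) -> List[Path]:
    """Main CLAUDE.md first, then the local file for the business in scope.

    If no business is given, or it has no local file, only the main file
    is returned.
    """
    files = [repo_root / "CLAUDE.md"]
    if business:
        # Assumed layout: businesses/<name>/CLAUDE.md
        local = repo_root / "businesses" / business / "CLAUDE.md"
        if local.is_file():
            files.append(local)
    return files
```

Loading the main file first and the local file second lets product-specific rules refine the cross-portfolio ones without duplicating them.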
Kelly Equivalent
Kelly's pipeline specialization is at the stage level — research-lead, project-lead, test-lead each handle their phase. Kelly does not have per-project or per-product specialized context files that load when the agent enters that project's scope. Kelly's projects/{id}/context.md files exist but are not CLAUDE.md-equivalent (they describe project state, not agent role and rules for that project).
Gas Town Equivalent
Gas Town's Crew members are named, persistent, per-domain agents with accumulated context. A PR Sheriff has different context and capabilities than a DB Sheriff. Crew persistence means they accumulate domain knowledge over time. This is structurally similar to Yuki's per-business CLAUDE.md — the domain specialization is persistent, not re-loaded from scratch each session.
Gap Analysis
| System | Specialization Mechanism | Context Persistence | Scope |
|---|---|---|---|
| Yuki AI CEO | Per-business CLAUDE.md files | Yes (loaded on session start per business) | Product-level |
| Kelly | Named lead agents per stage (research-lead, etc.) | Partial (ephemeral sub-agents) | Pipeline stage |
| Gas Town | Named Crew members (PR Sheriff, DB Sheriff) | Yes (persistent per-Rig agents) | Domain-level |
Kelly gap: Kelly's pipeline specialization is horizontal (stage-based); Yuki's per-business CLAUDE.md is vertical (product-based). Kelly should consider per-project context files that function like a project's CLAUDE.md: loaded when entering that project's pipeline scope, containing product-specific rules, priorities, and context that the general soul doesn't cover.
Key Insight
Yuki's per-business CLAUDE.md demonstrates that specialization can live in context files rather than in distinct agent instances. The same AI CEO operates in different products by loading different context. Kelly could achieve the same effect by making CLAUDE.md-style project context files first-class: loaded when the router assigns work to a project, rather than kept only as passive state files.
Summary: Concept Cross-Reference Table
| Concept | Yuki AI CEO | Kelly | Gas Town | Kelly Gap |
|---|---|---|---|---|
| Brain/memory substrate | GitHub repo + CLAUDE.md + decisions/ | soul + memory + 5-layer system | Beads in Dolt (git-versioned SQL) | Sub-agents need per-agent identity files |
| Authority model | Explicit 3-tier matrix + transfer log | Gate validation (implicit tiers in AGENTS.md) | Role hierarchy (Mayor/Crew/Polecats) | Missing explicit authority matrix with transfer log |
| Autonomous loops | 3 production loops (models, bugs, SEO) + compounding | None (cron/TaskFlow are non-compounding) | GUPP hooks (non-compounding) | Missing autonomous compounding loops |
| Memory format | Narrative > tables (empirical); hybrid | 5-layer (narrative top, structured bottom) | Beads-as-Why (structured Why field) | Kelly design validated; Gas Town risk of over-structuring |
| Progressive disclosure | CLAUDE.md shrunk 36% (152→98 lines) | 5-layer demand-loaded system | MEOW graph traversal | Already aligned — validation of Kelly design |
| Mechanical vs strategic separation | n8n vs agent sessions | sessions_yield vs cron | Implicit in hook model | Already aligned — Kelly design confirmed |
| Periodic quality review | Quarterly board reviews | TEA audit (per-stage, not periodic) | Continuous Witness + Mayor editorial | Missing periodic strategic autonomy review |
| Mistake log | Public, version-controlled learnings file | SELF_IMPROVEMENT.md + memory (partial) | Bead reason fields (diffuse) | Need explicit public/version-controlled mistake log |
| Per-domain context | Per-business CLAUDE.md files | Project context files (not CLAUDE.md-equivalent) | Named Crew members (persistent domain agents) | Need per-project CLAUDE.md-style context files |
| Temporal outcome review | 30-day decision reviews | None | None | Missing 30-day outcome review mechanism |
Related Articles
[[yukicapital-ai-ceo-overview]], [[yukicapital-ai-ceo-experiment]], [[yukicapital-board-review-2]], [[yukicapital-board-review-3]], [[kelly-gas-town-gap-analysis]], [[multi-factory-comparison]], [[steve-yegge-beads]], [[steve-yegge-gupp]], [[steve-yegge-meow]], [[steve-yegge-hierarchy]], [[kelly-handbook-multi-agent]], [[kelly-handbook-software-factory]]