Beads vs Kelly Pipeline Assessment — Can Beads Replace Kelly's State Tracking?¶

Assessor: Carson (dark-factory-kb subagent)
Date: 2026-04-26
Sources: steve-yegge-beads, steve-yegge-meow, steve-yegge-gas-town, Kelly pipeline documentation (METHOD.md, STAGES.md, SOURCES.md)

Executive Summary¶

This assessment evaluates how practical Steve Yegge's Beads framework would be as a replacement for the Kelly pipeline's current state tracking infrastructure. Beads (git-versioned, SQL-queryable work primitives built on Dolt) offer compelling advantages in queryability, auditability, and composability — but integrating them into the existing Kelly system involves non-trivial migration complexity and some genuine trade-offs.

Overall recommendation: Partial Adopt. Beads should be adopted first for done markers and TEA audits, and as the long-term substrate beneath the memory files. The pipeline state JSON, heartbeat, and the human-editable memory files should be kept in the near term, with their migration evaluated after the Beads substrate is established.

Assessment Framework¶

For each Kelly state tracking mechanism, this assessment evaluates:

- What changes: How the mechanism would look under a Beads-based system
- What breaks: Functionality that Beads cannot replicate or would require significant work to support
- What improves: Capabilities that Beads would add or enhance
- Implementation complexity: Estimated effort to implement, on a 1–5 scale (1 = trivial, 5 = major rewrite)
- Recommendation: Adopt / Partial Adopt / Keep Existing

1. pipeline state¶

What It Is¶

pipeline state is the Kelly pipeline's central state file, tracking which stage each work item is in (Intake, Research, Planning, Implementation, Testing, Release), which subphase is active, and metadata about each stage's completion status.

What Changes Under Beads¶

Each pipeline stage and subphase becomes a Bead rather than a JSON blob. Instead of:

{
  "current_stage": "implementation",
  "subphase": "coding",
  "started_at": "2026-04-26T08:00:00Z"
}

You'd have a Bead with typed fields and explicit state transitions. The pipeline stage is the Bead type; the state machine (pending → running → done/failed) is the Bead lifecycle. The full history of state transitions is preserved in git history.
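The lifecycle described above can be sketched as a small state machine. This is only an illustration: the `Bead` class, its field names, and the `transition` helper are hypothetical and not taken from the Beads or Kelly code, and the in-memory `history` list merely stands in for versioned history.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone

# Allowed lifecycle moves: pending -> running -> done/failed (hypothetical encoding).
ALLOWED = {
    "pending": {"running"},
    "running": {"done", "failed"},
    "done": set(),
    "failed": set(),
}


@dataclass
class Bead:
    bead_type: str   # pipeline stage, e.g. "implementation"
    subphase: str    # e.g. "coding"
    status: str = "pending"
    history: list = field(default_factory=list)  # stand-in for commit history

    def transition(self, new_status: str, author: str, reason: str) -> None:
        """Move to new_status, rejecting moves the lifecycle does not allow."""
        if new_status not in ALLOWED[self.status]:
            raise ValueError(f"illegal transition {self.status} -> {new_status}")
        self.history.append({
            "from": self.status,
            "to": new_status,
            "author": author,
            "reason": reason,
            "at": datetime.now(timezone.utc).isoformat(),
        })
        self.status = new_status


bead = Bead(bead_type="implementation", subphase="coding")
bead.transition("running", author="agent-1", reason="planning approved")
bead.transition("done", author="agent-1", reason="tests pass")
```

Rejecting illegal moves in one place is what lets the lifecycle, rather than each writer, guarantee that a finished Bead is never silently reopened.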

What Breaks¶

- JSON consumers break. Any tooling that reads pipeline state as a flat JSON file (dashboards, CI integrations, monitoring tools) would need to be updated to query Dolt instead. This is non-trivial for external consumers.
- Atomic multi-field updates. pipeline state supports atomic multi-field writes (for example, advance the stage, record the timestamp, and update the subphase in one operation). Dolt transactions support this too, but they are a different mental model from rewriting a single file.
- Simple observability. A human reading pipeline state in an editor sees the full state at a glance. Dolt requires either SQL queries or a UI layer on top.
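To make the transactional model concrete, here is a minimal sketch of an all-or-nothing stage advance. Python's `sqlite3` stands in for Dolt so the example runs anywhere; the `beads` and `transitions` tables and the `advance` helper are assumptions made for illustration, not an existing schema.

```python
import sqlite3

# sqlite3 stands in for Dolt here; the schema is hypothetical.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE beads (id TEXT PRIMARY KEY, stage TEXT, subphase TEXT, started_at TEXT)"
)
conn.execute(
    "CREATE TABLE transitions (bead_id TEXT, to_stage TEXT, reason TEXT NOT NULL)"
)
conn.execute(
    "INSERT INTO beads VALUES ('item-1', 'planning', 'design', '2026-04-25T08:00:00Z')"
)
conn.commit()


def advance(conn, bead_id, stage, subphase, started_at, reason):
    """Advance the stage and log the transition as one all-or-nothing unit."""
    with conn:  # commits on success, rolls back if any statement raises
        conn.execute(
            "UPDATE beads SET stage = ?, subphase = ?, started_at = ? WHERE id = ?",
            (stage, subphase, started_at, bead_id),
        )
        conn.execute(
            "INSERT INTO transitions VALUES (?, ?, ?)",
            (bead_id, stage, reason),
        )


def current(conn, bead_id):
    """Return (stage, subphase, started_at) for one Bead."""
    return conn.execute(
        "SELECT stage, subphase, started_at FROM beads WHERE id = ?", (bead_id,)
    ).fetchone()


advance(conn, "item-1", "implementation", "coding", "2026-04-26T08:00:00Z", "plan approved")
```

If the second statement fails (for instance, a missing reason), the stage update is rolled back as well, which is the behaviour a single-file rewrite gives implicitly.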

What Improves¶

- Full audit trail. Every state transition is a Dolt commit with author, timestamp, and reason. You can dolt log your way to any historical state, or run SQL queries across the history.
- Cross-pipeline queries. If you have multiple pipelines running, you can query across all of them: "Show me all Beads of type testing that transitioned to done in the last 24 hours across all pipelines."
- Branch and experiment. You can branch a Bead sequence to try a different approach, test it, then merge or discard. A flat JSON file has no equivalent.
- Attestations attached to state. The Witness quality gate can attach its validation as a typed Bead edge, directly to the pipeline Bead it validated. In pipeline state, validation results are typically stored separately or appended to a log.
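The cross-pipeline query quoted above might look like the following sketch. `sqlite3` again stands in for Dolt, and the `beads` table with its `pipeline`, `type`, `status`, and `done_at` columns is an assumed schema, not one defined by Beads.

```python
import sqlite3
from datetime import datetime, timedelta, timezone

# sqlite3 stands in for Dolt; table and column names are hypothetical.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE beads (id TEXT, pipeline TEXT, type TEXT, status TEXT, done_at TEXT)"
)
conn.executemany(
    "INSERT INTO beads VALUES (?, ?, ?, ?, ?)",
    [
        ("b1", "alpha", "testing", "done", "2026-04-26T09:00:00Z"),
        ("b2", "beta", "testing", "done", "2026-04-25T13:30:00Z"),
        ("b3", "beta", "testing", "done", "2026-04-24T10:00:00Z"),   # too old
        ("b4", "alpha", "testing", "running", None),                 # not done
        ("b5", "alpha", "release", "done", "2026-04-26T10:00:00Z"),  # other type
    ],
)


def recently_done(conn, bead_type, now, window=timedelta(hours=24)):
    """Beads of one type that reached done within the window, across all pipelines."""
    # ISO-8601 UTC strings in one fixed format compare correctly as text.
    cutoff = (now - window).strftime("%Y-%m-%dT%H:%M:%SZ")
    return conn.execute(
        "SELECT pipeline, id FROM beads "
        "WHERE type = ? AND status = 'done' AND done_at >= ? "
        "ORDER BY pipeline, id",
        (bead_type, cutoff),
    ).fetchall()


now = datetime(2026, 4, 26, 12, 0, tzinfo=timezone.utc)
result = recently_done(conn, "testing", now)
```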

Implementation Complexity: 3/5¶

Dolt setup and schema design are non-trivial. Migrating existing pipeline state entries to Beads requires a migration script. The Kelly pipeline would also need a Beads adapter layer that translates pipeline operations into Dolt operations.

Recommendation: Partial Adopt¶

Keep pipeline state as a lightweight view/alias on top of the Beads layer in the near term. Build the Beads substrate first; create a pipeline state generator that queries Dolt to produce the JSON view. Once the Beads layer is stable, deprecate the JSON and migrate consumers to direct Dolt queries.
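A pipeline state generator of the kind recommended here could be as small as the sketch below. It assumes a hypothetical `beads` table in which the active stage is the row with status `running`, and it uses `sqlite3` as a stand-in for Dolt; the output keys mirror the JSON shown earlier.

```python
import json
import sqlite3

# sqlite3 stands in for Dolt; the schema is an assumption for illustration.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE beads (work_item TEXT, type TEXT, subphase TEXT, status TEXT, started_at TEXT)"
)
conn.executemany(
    "INSERT INTO beads VALUES (?, ?, ?, ?, ?)",
    [
        ("item-1", "planning", "design", "done", "2026-04-25T08:00:00Z"),
        ("item-1", "implementation", "coding", "running", "2026-04-26T08:00:00Z"),
    ],
)


def render_pipeline_state(conn, work_item):
    """Produce the legacy JSON view from the Bead that is currently running."""
    row = conn.execute(
        "SELECT type, subphase, started_at FROM beads "
        "WHERE work_item = ? AND status = 'running' "
        "ORDER BY started_at DESC LIMIT 1",
        (work_item,),
    ).fetchone()
    if row is None:
        return "{}"  # nothing active for this work item
    return json.dumps(
        {"current_stage": row[0], "subphase": row[1], "started_at": row[2]},
        indent=2,
    )


state_json = render_pipeline_state(conn, "item-1")
```

Because existing consumers keep reading the same JSON shape, they can be migrated one at a time instead of all at once.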

2. done markers Per Subphase¶

What It Is¶

The Kelly pipeline uses explicit DONE markers (typically in memory or a separate marker file) to signal that a subphase has been completed. When a sub-agent finishes its work, it writes a DONE marker that the parent pipeline stage uses as a signal to proceed.

What Changes Under Beads¶

done markers become Bead state transitions. A subphase completing is a Bead moving from running → done. The equivalent of "this subphase is DONE" is a Dolt query: SELECT * FROM beads WHERE type = 'subphase' AND name = 'coding' AND status = 'done'. The git commit that records the state transition IS the DONE marker — immutable, timestamped, authored.
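As a rough sketch, the query above can be wrapped in a helper that answers the same question a DONE marker answers today. `sqlite3` stands in for Dolt, and `is_done` is a hypothetical helper name.

```python
import sqlite3

# sqlite3 stands in for Dolt; the beads table follows the query in the text.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE beads (type TEXT, name TEXT, status TEXT)")
conn.executemany(
    "INSERT INTO beads VALUES (?, ?, ?)",
    [("subphase", "coding", "done"), ("subphase", "review", "running")],
)


def is_done(conn, name):
    """True when a subphase Bead with this name has reached status 'done'."""
    row = conn.execute(
        "SELECT 1 FROM beads "
        "WHERE type = 'subphase' AND name = ? AND status = 'done' LIMIT 1",
        (name,),
    ).fetchone()
    return row is not None
```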

What Breaks¶

- Simple grep-based DONE detection. Current done markers can be found with grep -r "DONE" memory/. Beads require a Dolt query or a custom Beads CLI. This makes manual verification harder.
- No visual DONE in editor. A human reviewing the pipeline can't just open a file and see done markers. They need a Beads viewer.
- Bead overhead for simple cases. If a subphase is just "read this file," creating a Bead for it is heavier than writing a DONE marker.

What Improves¶

- Tamper-evident history. Once a Bead state transition is committed, it remains in Dolt's commit history even if the row is later changed or deleted, so accidental loss is detectable and recoverable. Dolt does not make rows immutable by itself; treating finalized Beads as immutable is a convention the Beads layer would have to enforce. done markers in text files can be accidentally removed or corrupted.
- Typed DONE semantics. A Bead's done state can carry metadata: who marked it done, what the outcome was, what the next Bead depends on. A text DONE marker has none of this.
- Query across all DONE subphases. You can ask: "Which subphases were marked done in the last hour? By which agents? On which projects?" Answering that from text done markers is impractical.

Implementation Complexity: 2/5¶

done markers are already semantically equivalent to Bead state transitions. Writing a Bead adapter that wraps the existing DONE marker logic is straightforward. The main effort is Dolt schema design and ensuring the adapter is transparent to existing sub-agent code.

Recommendation: Adopt¶

Replace done markers with Bead state transitions. The semantic mapping is natural, the benefits (immutability, auditability, queryability) are significant, and the implementation complexity is low. Existing sub-agent DONE marker code can be wrapped in a thin Bead adapter.
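One possible shape for that thin adapter is sketched below. The class name `BeadDoneAdapter`, its `start` and `write_done` methods, and the table layout are all hypothetical; `sqlite3` stands in for Dolt. The point is only that sub-agent code keeps calling a `write_done`-style function while the adapter records a typed transition.

```python
import sqlite3


class BeadDoneAdapter:
    """Keeps a write_done(subphase) style call while recording a Bead transition."""

    def __init__(self, conn, agent):
        self.conn = conn
        self.agent = agent
        conn.execute(
            "CREATE TABLE IF NOT EXISTS beads ("
            "name TEXT PRIMARY KEY, type TEXT, status TEXT, done_by TEXT, outcome TEXT)"
        )

    def start(self, subphase):
        """Register a subphase Bead in the running state."""
        with self.conn:
            self.conn.execute(
                "INSERT INTO beads (name, type, status) VALUES (?, 'subphase', 'running')",
                (subphase,),
            )

    def write_done(self, subphase, outcome="ok"):
        """Replacement for writing a text DONE marker: running -> done."""
        with self.conn:
            cur = self.conn.execute(
                "UPDATE beads SET status = 'done', done_by = ?, outcome = ? "
                "WHERE name = ? AND status = 'running'",
                (self.agent, outcome, subphase),
            )
        if cur.rowcount != 1:
            raise RuntimeError(f"subphase {subphase!r} is not running")


conn = sqlite3.connect(":memory:")  # stand-in for a Dolt connection
adapter = BeadDoneAdapter(conn, agent="agent-7")
adapter.start("coding")
adapter.write_done("coding")
```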

3. TEA Audit Results¶

What It Is¶

The Kelly TEA audit (Thought-Event-Action audit) captures the reasoning chain behind each significant decision: what was the thought process, what event triggered the action, what action was taken, and why. TEA audits are stored as narrative text in memory files and are the primary mechanism for onboarding new agents to a project's history.

What Changes Under Beads¶

TEA audits map almost perfectly to Beads' core value proposition. In the Beads model, every Bead captures Why — the intent, reasoning, and decision history. A Bead's reason field IS the TEA's "Thought" component. The Bead's metadata (author, created_at, parent_bead_id) provides the Event and Action context.

Under full MEOW adoption, every TEA audit item becomes a Bead node in the versioned knowledge graph. The TEA audit trail becomes a DAG of Beads, with typed edges connecting reasoning Beads to action Beads to outcome Beads.
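The DAG idea can be illustrated with a few typed edges and a backwards traversal. The edge type names (`motivates`, `triggers`, `produces`) and the Bead identifiers are invented for this sketch; MEOW may define its own vocabulary.

```python
from collections import defaultdict

# (source Bead, edge type, target Bead); all names are hypothetical.
EDGES = [
    ("thought-1", "motivates", "action-1"),
    ("event-1", "triggers", "action-1"),
    ("action-1", "produces", "outcome-1"),
    ("thought-2", "motivates", "action-2"),
]


def ancestors(edges, node):
    """Return every Bead that leads to node, following edges backwards."""
    parents = defaultdict(list)
    for src, _edge_type, dst in edges:
        parents[dst].append(src)
    seen, stack = set(), [node]
    while stack:
        for parent in parents[stack.pop()]:
            if parent not in seen:
                seen.add(parent)
                stack.append(parent)
    return seen


lineage = ancestors(EDGES, "outcome-1")
```

Starting from an outcome and walking backwards recovers the thought and event behind it, which is the reconstruction a reader currently does by hand from narrative files.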

What Breaks¶

- Narrative TEA format is lost. Current TEA audits are narrative prose, readable by humans as stories. Beads capture structured fields (reason, type, status, author). A rich narrative TEA can't be fully captured in a Bead's structured fields without losing nuance.
- Free-form reasoning. TEA audits allow free-form reasoning about decisions. Beads require typed structure. The translation from narrative TEA to structured Bead fields involves information loss.
- Existing TEA tooling. Any tooling that reads and processes TEA audit results (e.g., onboarding generators, forensic tools) would need to be updated to query Beads instead of narrative files.

What Improves¶

- Machine-queryable reasoning. "Show me all TEA audits where the reason field contains 'security'" becomes a SQL query, not a grep across narrative files.
- Onboarding via graph traversal. New agents can reconstruct the full decision history of a project by traversing the Bead DAG, not by reading chronologically ordered narrative files.
- TEA + state transitions unified. Currently TEA audits are stored separately from pipeline state. Under Beads, the TEA audit IS the state history. The reasoning and the action are the same object.
- Cross-project TEA comparison. You can query: "How did the reasoning patterns in Project X differ from Project Y?" This is possible with Beads' SQL interface and very difficult with narrative files.
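A small sketch of the first and last points together: counting, per project, the TEA Beads whose reason mentions a keyword. The `tea_beads` table and its columns are assumed for illustration, and `sqlite3` stands in for Dolt.

```python
import sqlite3

# sqlite3 stands in for Dolt; table and column names are hypothetical.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE tea_beads (project TEXT, reason TEXT)")
conn.executemany(
    "INSERT INTO tea_beads VALUES (?, ?)",
    [
        ("x", "rotate keys for security"),
        ("x", "refactor for speed"),
        ("y", "security review blocked release"),
        ("y", "applied security patch"),
        ("z", "update docs"),
    ],
)


def reason_counts(conn, keyword):
    """Per-project count of TEA Beads whose reason contains the keyword."""
    return conn.execute(
        "SELECT project, COUNT(*) FROM tea_beads "
        "WHERE reason LIKE ? GROUP BY project ORDER BY project",
        (f"%{keyword}%",),
    ).fetchall()


security_by_project = reason_counts(conn, "security")
```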

Implementation Complexity: 3/5¶

The challenge is not technical but conceptual: designing the Bead schema to capture TEA semantics without losing the narrative richness. Requires careful schema design and a migration strategy for existing TEA audits. The MEOW framework provides guidance but Kelly-specific customization is needed.

Recommendation: Adopt (with schema design phase first)¶

TEA audits are the strongest candidate for full Beads adoption. The semantic match is excellent. Begin with a schema design phase: map current TEA audit fields to Bead fields, identify which TEA content is structured (can be typed) vs free-form (needs a notes or reason text field), then implement a migration script for existing TEA audits. The payoff — machine-queryable reasoning history — is worth the investment.
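One way to approach the structured versus free-form split during schema design is sketched below. The TEA entry keys, the `STRUCTURED_FIELDS` set, and the `tea_to_bead` function are assumptions for illustration; the real field list would come out of the schema design phase.

```python
# Fields assumed to map onto typed Bead columns; everything else is free-form.
STRUCTURED_FIELDS = {"author", "created_at", "event", "action"}


def tea_to_bead(tea):
    """Map a TEA entry to a Bead dict, keeping unmapped content in notes."""
    bead = {"type": "tea_audit", "reason": tea.get("thought", "")}
    for key in STRUCTURED_FIELDS:
        if key in tea:
            bead[key] = tea[key]
    leftovers = {
        k: v for k, v in tea.items()
        if k not in STRUCTURED_FIELDS and k != "thought"
    }
    # Free-form material is preserved verbatim rather than dropped.
    bead["notes"] = "\n".join(f"{k}: {v}" for k, v in sorted(leftovers.items()))
    return bead


bead = tea_to_bead({
    "thought": "Postgres fits the relational data",
    "event": "schema review",
    "action": "chose Postgres",
    "author": "agent-2",
    "mood": "cautious",
    "context": "after load test",
})
```

Routing everything unmapped into a notes field keeps the migration lossless even before the schema settles.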

4. heartbeat¶

What It Is¶

heartbeat is Kelly's mechanism for agents to periodically check in, confirming liveness and reporting short-term status. It's a lightweight file that agents update (typically every 30 minutes) with their current activity, pending work, and state.

What Changes Under Beads¶

heartbeat entries become Beads of type heartbeat. Each heartbeat is a short-lived Bead with an author, timestamp, current activity, and a time-to-live (TTL). The Deacon equivalent (or Kelly's heartbeat checker) patrols heartbeat Beads, and if any agent's most recent heartbeat Bead is older than the TTL, it triggers an alert.
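The patrol logic reduces to comparing each agent's most recent heartbeat against the TTL. The sketch below assumes heartbeats arrive as `(agent, timestamp)` pairs and uses the 30-minute interval as the TTL; `stale_agents` is a hypothetical helper, not part of Kelly or Gas Town.

```python
from datetime import datetime, timedelta, timezone


def stale_agents(heartbeats, now, ttl=timedelta(minutes=30)):
    """Agents whose most recent heartbeat is older than the TTL."""
    latest = {}
    for agent, ts in heartbeats:
        if agent not in latest or ts > latest[agent]:
            latest[agent] = ts
    return sorted(agent for agent, ts in latest.items() if now - ts > ttl)


utc = timezone.utc
now = datetime(2026, 4, 26, 12, 0, tzinfo=utc)
heartbeats = [
    ("a", datetime(2026, 4, 26, 11, 15, tzinfo=utc)),
    ("a", datetime(2026, 4, 26, 11, 45, tzinfo=utc)),  # fresh
    ("b", datetime(2026, 4, 26, 11, 0, tzinfo=utc)),   # 60 minutes old
    ("c", datetime(2026, 4, 26, 11, 30, tzinfo=utc)),  # exactly at the TTL
]
alerts = stale_agents(heartbeats, now)
```

A heartbeat that is exactly as old as the TTL is treated as still live here; whether the boundary should alert is a policy choice.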

What Breaks¶

- Corrections require a new entry. Most heartbeat implementations already append a new entry rather than overwrite, so appending a new heartbeat Bead is natural. If finalized Beads are treated as immutable, however, a mistaken heartbeat cannot be edited in place; the fix is a new Bead that corrects the previous one.
- Human readability at a glance. Opening heartbeat in an editor shows the recent heartbeat history in a simple list. A Beads-based heartbeat requires a viewer or SQL query.
- Very low overhead. heartbeat is just a text file append. Setting up a Dolt instance just for heartbeats is heavier than necessary for this use case.

What Improves¶

- Heartbeat history queryable. "Which agents have missed more than 2 heartbeats in the last 24 hours?" is one SQL query instead of parsing a text file.
- Heartbeat attribution. Each heartbeat Bead has an author. You know exactly which agent process wrote it.
- TTL enforcement is structural, not convention. GUPP's Deacon patrols hooks. A Deacon equivalent for Kelly would patrol heartbeat Beads in the same way, so staleness is detected from the data model rather than from agents following a convention.
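The missed-heartbeat question can be approximated as "expected heartbeats in the window minus observed heartbeats". The sketch below makes that assumption explicit, uses a hypothetical `agents` table so that agents with no heartbeats at all are still reported, and runs on `sqlite3` as a stand-in for Dolt. A two-hour window keeps the sample data short.

```python
import sqlite3
from datetime import datetime, timedelta, timezone

# sqlite3 stands in for Dolt; both tables are hypothetical.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE agents (name TEXT PRIMARY KEY)")
conn.execute("CREATE TABLE heartbeats (agent TEXT, at TEXT)")
conn.executemany("INSERT INTO agents VALUES (?)", [("a",), ("b",), ("c",), ("d",)])
conn.executemany(
    "INSERT INTO heartbeats VALUES (?, ?)",
    [
        ("a", "2026-04-26T10:15:00Z"), ("a", "2026-04-26T10:45:00Z"),
        ("a", "2026-04-26T11:15:00Z"), ("a", "2026-04-26T11:45:00Z"),
        ("b", "2026-04-26T11:45:00Z"),
        ("b", "2026-04-25T09:00:00Z"),  # outside the window, ignored
        ("d", "2026-04-26T10:30:00Z"), ("d", "2026-04-26T11:30:00Z"),
    ],
)


def missed_heartbeats(conn, now, window, interval, threshold=2):
    """Agents whose (expected - observed) heartbeats in the window exceed threshold."""
    expected = window // interval  # e.g. 2 h / 30 min = 4
    cutoff = (now - window).strftime("%Y-%m-%dT%H:%M:%SZ")
    return conn.execute(
        """
        SELECT a.name, :expected - COUNT(h.at) AS missed
        FROM agents a
        LEFT JOIN heartbeats h ON h.agent = a.name AND h.at >= :cutoff
        GROUP BY a.name
        HAVING :expected - COUNT(h.at) > :threshold
        ORDER BY a.name
        """,
        {"expected": expected, "cutoff": cutoff, "threshold": threshold},
    ).fetchall()


now = datetime(2026, 4, 26, 12, 0, tzinfo=timezone.utc)
missed = missed_heartbeats(conn, now, timedelta(hours=2), timedelta(minutes=30))
```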

Implementation Complexity: 2/5¶

Conceptually simple, but the Dolt overhead may not be justified for this specific use case. If the full Beads/Dolt infrastructure is already in place for TEA and done markers, adding heartbeats is trivial. If setting up Dolt from scratch, heartbeat is too lightweight to justify the infrastructure cost.

Recommendation: Keep Existing (near term), Migrate if Dolt is already deployed¶

If the Dolt/Beads infrastructure is deployed for TEA and done markers, migrate heartbeat to Beads as part of the same deployment. If starting fresh on just this component, keep heartbeat — the overhead of Dolt is not worth the marginal improvement for heartbeats alone.

5. Memory Files (memory, memory/YYYY-MM-DD.md)¶

What It Is¶

Kelly's memory system consists of:
- memory — curated long-term memory, distilled learnings, significant decisions, and persistent context for the main session
- memory/YYYY-MM-DD.md — daily raw logs of what happened, session transcripts, and context capture

What Changes Under Beads¶

Under MEOW, memory becomes the versioned knowledge graph. Every daily log entry becomes a set of Beads. memory's curated content becomes queryable Bead content, with the curation itself being a Bead operation (editorial selection from the daily graph).

Concretely:
- memory/2026-04-26.md → a daily-log Bead or set of Beads capturing that day's events, authored by the agent
- memory → a Bead query result: "show me all Beads tagged significant from the knowledge graph, ordered by date"

The daily log pattern maps well to Beads: each day's events are a Bead with child Beads for each session or event. The curator (human or agent) can then pull significant items from the daily Beads into a curated view (equivalent to memory today).
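The daily-log and curated-view relationship might be sketched as follows. The dictionary layout, the `significant` tag, and the `curated_view` function are illustrative assumptions; in a real deployment the data would come from a Bead query rather than a literal list.

```python
# Each daily-log Bead holds child Beads; tags mark what the curator selected.
daily_beads = [
    {"type": "daily-log", "date": "2026-04-26", "children": [
        {"text": "adopted Beads for done markers", "tags": ["significant"]},
        {"text": "routine heartbeat", "tags": []},
    ]},
    {"type": "daily-log", "date": "2026-04-25", "children": [
        {"text": "fixed flaky test", "tags": []},
        {"text": "decided to use Postgres", "tags": ["significant"]},
    ]},
]


def curated_view(daily_beads, tag="significant"):
    """Render tagged child Beads, ordered by date, as a memory-style list."""
    lines = []
    for day in sorted(daily_beads, key=lambda bead: bead["date"]):
        for child in day["children"]:
            if tag in child["tags"]:
                lines.append(f"- {day['date']}: {child['text']}")
    return "\n".join(lines)


memory_text = curated_view(daily_beads)
```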

What Breaks¶

- Human writing directly to memory. memory is a human-editable file that agents and humans can write to directly. Beads require a Dolt write path: echo "decided to use Postgres" >> memory no longer works, so a Beads CLI or API is needed.
- Narrative memory is richer than structured Beads. A human writing in memory can use narrative, emotions, opinions, and context that don't fit a structured Bead schema. A memory entry can be a story. A Bead is a data point.
- Back-referencing and free association. memory entries can reference each other in free-form ("see also the entry from March 15th"). Beads' typed edges are more powerful but require explicit authoring of the edge; you can't accidentally create an implicit association the way narrative writing naturally does.
- Long-form narrative fit. Dolt's versioned storage is appropriate for structured data. Long-form narrative in Beads is less natural: the reason field captures "why" but not the full narrative voice of memory.

What Improves¶

- Queryable memory. "What was the last decision made about the database layer?" becomes one SQL query, not a grep across 6 months of daily logs.
- Memory provenance. Every memory Bead has an author and timestamp. You know which agent captured which memory.
- Memory recovery. If memory is corrupted or accidentally deleted and the file is not itself under version control, it's gone. If the daily Beads are in Dolt, they're versioned and recoverable.
- Cross-agent memory sharing. If multiple agents are writing to the same Dolt instance, they share a knowledge graph. Currently, each agent's memory is siloed.
- Onboarding via memory graph traversal. A new agent joining a project can query the knowledge graph: "show me all decisions related to architecture, all bug reports related to performance, all resolved trade-offs." Typed edges provide the structure, not just keyword matching.

Implementation Complexity: 4/5¶

This is the highest-complexity migration because memory files are the most human-facing part of the Kelly system. Humans write to memory and daily logs directly. A Beads migration requires:
1. A human-usable Beads write interface (CLI or web UI)
2. A readable memory-equivalent view generated from Bead queries
3. A migration strategy for existing memory files
4. Careful schema design to capture the right level of structure without losing narrative richness

The complexity is high, but so is the potential: a shared, queryable, versioned knowledge graph across all agents and sessions is a significant upgrade.

Recommendation: Partial Adopt¶

Adopt Beads as the long-term memory substrate, but keep memory and daily logs as the primary human write interfaces in the near term. Build a Beads view layer that generates a memory-equivalent from Bead queries — this preserves the human-friendly write interface while gaining the queryability and durability benefits. As the interface matures, gradually shift to direct Beads writes.

Summary Table¶

| Kelly Mechanism | Changes | Breaks | Improves | Complexity | Recommendation |
|---|---|---|---|---|---|
| pipeline state | Becomes Bead state transitions | JSON consumers need migration | Full audit trail, cross-pipeline queries | 3/5 | Partial Adopt |
| done markers | Become Bead state transitions | Grep-based detection breaks | Immutable history, typed, queryable | 2/5 | Adopt |
| TEA audit | Becomes Bead graph (MEOW) | Narrative richness lost | Machine-queryable reasoning history | 3/5 | Adopt (schema first) |
| heartbeat | Becomes heartbeat Beads | Requires Dolt infrastructure | Structural TTL enforcement | 2/5 | Keep (migrate if Dolt deployed) |
| memory + daily logs | Become Bead knowledge graph | Human writes require new interface | Shared, queryable, versioned memory | 4/5 | Partial Adopt |

Implementation Roadmap¶

Phase 1: Quick Wins (Low Complexity, High Value)¶

- Deploy Dolt infrastructure (shared by all Phase 1+ components; prerequisite for the items below)
- Replace done markers with Bead state transitions (complexity: 2/5)
- Begin TEA audit schema design (coordinate with Beads schema)

Phase 2: Audit Layer (Medium Complexity)¶

- Migrate TEA audits to Beads with MEOW graph structure
- Create pipeline state view from Dolt queries (keep JSON, build Beads underneath)

Phase 3: Memory Layer (Higher Complexity)¶

- Deploy Beads write interface for humans
- Generate memory-equivalent from Bead queries
- Migrate daily logs to Beads

Phase 4: Pipeline Integration¶

- Deprecate pipeline state, migrate consumers to Dolt queries
- Evaluate heartbeat migration if not already on Dolt

Architectural Assessment¶

The Beads/MEOW framework is a genuine advance over the Kelly pipeline's current state tracking for the following reasons:

- Unified substrate. Instead of five separate mechanisms (pipeline state, done markers, TEA audits, heartbeat, memory files), Beads provide a single substrate. All five can coexist on the same Dolt instance, queryable together. Cross-mechanism queries become possible.
- Reasoning as first-class data. Kelly's TEA audit captures reasoning alongside actions, but it's a sidecar. Beads make reasoning (the reason field) a first-class citizen: the Bead IS the work item, and its version history IS the audit. This is a more powerful model.
- Network effects. The more Beads in the system, the more valuable the knowledge graph becomes. A Kelly pipeline with Beads will improve over time as more work items accumulate. A Kelly pipeline with pipeline state and memory files does not gain the same compounding benefit, because its history is not queryable as a graph.

The main risks are:
- Migration cost. The existing Kelly infrastructure is file-based and human-writable. Moving to Dolt requires infrastructure, tooling, and a migration strategy. This is real cost.
- Human interface regression. memory is extremely human-friendly. A Beads-based equivalent, even with a view layer, is a regression for the human write path until the interface matures.
- Vendor/infra lock-in. Dolt is a specific technology choice. Beads' benefits are tied to the Dolt implementation. If Dolt diverges or becomes unsupported, the migration cost is sunk.

These risks argue for a gradual migration rather than a big-bang rewrite, which is why the overall recommendation is Partial Adopt and the roadmap is phased by component.

Related Articles¶

steve-yegge-beads, steve-yegge-meow, steve-yegge-gas-town, steve-yegge-gupp, steve-yegge-hierarchy, steve-yegge-gas-city, steve-yegge-wasteland, steve-yegge-saas-mountain, steve-yegge-gas-town (overview)
