Gap Analysis: dark-factory-kb vs Karpathy LLM Wiki Pattern
Updated: 2026-05-29
Previous: 2026-04-27 (score: 19/100)
Change: Major operational improvements — machinery now running
Current KB Snapshot
| Metric | Value |
|---|---|
| Total articles | 169 |
| Source articles | ~50 |
| Concept pages | 61 |
| Q&A reports | 9 |
| Log entries | 41 |
| Daily lint | ✅ (last run: 2026-05-29) |
Dimension Scores
1. log.md — Score: 8/10 ✅
Evidence: compiled/log.md exists with 41 entries. Format is clean: ## [YYYY-MM-DD] type | Title. Types include ingest, health, dedup. Last entry: 2026-05-22. Unix-parsable with grep "^## \[" log.md.
Gap from perfect: No lint entries in the log (gates run but are not logged). The last ingest entry is May 22, a 7-day gap, meaning no new sources have been ingested since. No query entries are logged (Q&A files exist, but the log doesn't record when they were produced).
Old score: 0/10. This was the biggest gap in April. It is now one of the strongest operational components, second only to lint.
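Because every entry header follows one fixed pattern, the log can be audited mechanically. The sketch below is a minimal illustration, not part of the existing tooling: it parses headers with a regular expression and reports, per entry type, how many entries exist and when the most recent one was written, which makes a missing `lint` or `query` type easy to spot. The function names and the sample entries are assumptions made for the example.

```python
import re
from collections import Counter

# Matches the header format described above: ## [YYYY-MM-DD] type | Title
ENTRY_RE = re.compile(r"^## \[(\d{4}-\d{2}-\d{2})\] (\S+) \| (.+)$")


def parse_log(text):
    """Return a list of (date, entry_type, title) tuples, in file order."""
    entries = []
    for line in text.splitlines():
        match = ENTRY_RE.match(line)
        if match:
            entries.append(match.groups())
    return entries


def summarize(entries):
    """Count entries per type and record the latest date seen for each type."""
    counts = Counter(entry_type for _, entry_type, _ in entries)
    latest = {}
    for date, entry_type, _ in entries:
        # ISO dates sort correctly as plain strings.
        latest[entry_type] = max(date, latest.get(entry_type, date))
    return counts, latest


if __name__ == "__main__":
    # Illustrative sample only; these are not real entries from compiled/log.md.
    sample = (
        "## [2026-05-21] health | Example health check\n"
        "Body text is ignored.\n"
        "## [2026-05-22] ingest | Example source\n"
    )
    counts, latest = summarize(parse_log(sample))
    print(dict(counts), latest)
```

Running `summarize(parse_log(Path("compiled/log.md").read_text()))` against the real log would show at a glance which entry types have never been written.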
2. index.md / index.html — Score: 4/10
Evidence: compiled/sources/index.html exists — hand-crafted HTML with custom CSS (dark theme, DM Mono/DM Sans fonts). Comprehensive catalog of all articles organized by section. Not auto-generated; built manually by agents. No index.md (markdown equivalent) exists.
Gap from perfect: When a new source is ingested, index.html is not automatically updated; the homepage requires manual editing after each ingest. There is no index.md for the LLM to read at query time.
Old score: 4/10. No change — the problem persists but the manual catalog is thorough.
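One way to close this gap is to regenerate a Markdown index from the directory tree at the end of each ingest. The following is a rough sketch under stated assumptions: articles are `.md` files under subdirectories of `compiled/`, and an article's title is its first `# ` heading (falling back to the file name). The function names are hypothetical and do not describe an existing script.

```python
from pathlib import Path


def article_title(path):
    """Use the first Markdown H1 as the title, else fall back to the file stem."""
    for line in path.read_text(encoding="utf-8").splitlines():
        if line.startswith("# "):
            return line[2:].strip()
    return path.stem


def build_index(compiled_dir):
    """Return index.md text: one section per subdirectory, one link per article."""
    root = Path(compiled_dir)
    lines = ["# Index", ""]
    for section in sorted(p for p in root.iterdir() if p.is_dir()):
        # rglob also picks up nested folders inside a section.
        articles = sorted(section.rglob("*.md"))
        if not articles:
            continue
        lines.append(f"## {section.name}")
        lines.append("")
        for article in articles:
            rel = article.relative_to(root).as_posix()
            lines.append(f"- [{article_title(article)}]({rel})")
        lines.append("")
    return "\n".join(lines)


def write_index(compiled_dir):
    """Regenerate index.md in compiled_dir; meant to run after each ingest."""
    target = Path(compiled_dir) / "index.md"
    target.write_text(build_index(compiled_dir), encoding="utf-8")
    return target
```

Generating the index from the file system rather than editing it by hand means a new article cannot be ingested without also appearing in the catalog.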
3. Raw sources layer — Score: 6/10
Evidence: raw/ contains bmad/, gstack/, kelly/, sources/, steve-yegge/, superada/. Immutability not formally enforced — no SOURCES.md catalog with hashes and dates. raw/kelly-research-transcripts.md is a source doc (not raw). Some sources have been ingested, some haven't.
Old score: 6/10. No change — the directory exists and is well-organized, but the immutability guarantee and catalog are still missing.
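A catalog with hashes and dates could be produced along the following lines. This is a sketch only: the table layout, the use of SHA-256, and the use of file modification time as the "date" are all assumptions, since no SOURCES.md format has been defined yet.

```python
import hashlib
from datetime import datetime, timezone
from pathlib import Path


def sha256_of(path):
    """Hash a file in chunks so large transcripts are not loaded at once."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()


def catalog_rows(raw_dir):
    """Yield (relative_path, sha256, modified_date) for every file under raw_dir."""
    root = Path(raw_dir)
    for path in sorted(p for p in root.rglob("*") if p.is_file()):
        modified = datetime.fromtimestamp(path.stat().st_mtime, tz=timezone.utc)
        yield path.relative_to(root).as_posix(), sha256_of(path), modified.date().isoformat()


def render_catalog(raw_dir):
    """Render the rows as a Markdown table suitable for a SOURCES.md file."""
    lines = ["| Path | SHA-256 | Modified (UTC) |", "|---|---|---|"]
    lines += [f"| {p} | {h} | {d} |" for p, h, d in catalog_rows(raw_dir)]
    return "\n".join(lines) + "\n"


def changed_files(raw_dir, recorded):
    """Return paths whose current hash differs from a {path: sha256} record."""
    current = {p: h for p, h, _ in catalog_rows(raw_dir)}
    return sorted(p for p, h in recorded.items() if current.get(p) != h)
```

With a recorded catalog in place, `changed_files` gives a cheap immutability check: any path it returns was modified or deleted after it was cataloged.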
4. Concept pages — Score: 5/10
Evidence: 61 concept pages in compiled/concepts/. Topics range from agent patterns (kelly-router, ralph-protocol) to memory systems (five-layer-memory) to visualization (tufte-viz-principles). However:
- index.html links to 0 concept pages — the concept layer is invisible from main navigation
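- 7 orphan articles in the latest lint run, 3 of which are concept pages (the tufte-viz pages); the other 4 are the patterns/ and sources/ copies of ralph-refinements and story-by-story-build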
- No category-level index for concepts (no compiled/concepts/CONCEPT_INDEX.md)
- Backlinks from source articles to concept pages are partial at best
Old score: 4/10. Small improvement — more concept pages exist now, but they're still disconnected from navigation.
5. Lint operation — Score: 10/10 ✅
Evidence: lint.py at ~/.openclaw/workspace/kb_shared/tools/lint.py runs daily via cron. Reports exist for May 21, 22, 24, 27, 29. Checks: orphans, dead-ends, completeness, stale references, duplicates, encoding, connection opportunities, missing data imputation. Reports stored in compiled/reports/lint-YYYY-MM-DD.md.
Old score: 0/10. The biggest improvement — lint went from nonexistent to fully operational. Running on schedule.
Current findings (2026-05-29):
- 7 orphans ⚠️
- 4 dead-ends ⚠️
- 1 completeness failure ❌ (kelly-deacon-architecture.md missing bibliography field)
- 31 stale URLs ℹ️ (localhost refs, medium.com 403s, superada.ai 404s, substack 404s)
- 6 duplicate pairs ❌❌ (patterns/ vs sources/ at 100% overlap)
6. Query operation — Score: 7/10
Evidence: 9 Q&A reports in compiled/reports/q-*.md:
- q-beads-vs-kelly-pipeline-state.md
- q-cis-ready-criteria.md
- q-gupp-vs-kelly-autonomous-continuation.md
- q-kelly-authority-matrix.md
- q-kelly-autonomous-compounding-loops.md
- q-kelly-cis-pipeline-vs-superada-7-agent-crew.md
- q-kelly-five-layer-memory-system.md
- q-kelly-router-spawn-vs-gas-town-mayor.md
- q-kelly-spawn-protocol.md
Gap from perfect: The log doesn't record when Q&A files were produced — no ## [date] query | Question entries in log.md. Output formats limited to markdown only (no Marp slides, no matplotlib charts, no canvas). No systematic "filed back" pattern — Q&A outputs are in reports/ not integrated back into the concept/source layer.
Old score: 0/10. Massive improvement — the query loop is exercised and producing outputs.
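Recording each Q&A session could be as small as one append call made when the report file is written. The sketch below assumes the log lives in a single Markdown file and that a `query` entry type is acceptable; the helper names are hypothetical.

```python
from datetime import date
from pathlib import Path


def format_entry(entry_type, title, body="", on=None):
    """Build one log entry using the ## [YYYY-MM-DD] type | Title header."""
    stamp = (on or date.today()).isoformat()
    header = f"## [{stamp}] {entry_type} | {title}"
    return f"{header}\n{body.strip()}\n" if body.strip() else f"{header}\n"


def append_entry(log_path, entry_type, title, body="", on=None):
    """Append to the log without rewriting or reordering earlier entries."""
    path = Path(log_path)
    needs_gap = path.exists() and path.stat().st_size > 0
    with open(path, "a", encoding="utf-8") as handle:
        if needs_gap:
            handle.write("\n")  # blank line between entries
        handle.write(format_entry(entry_type, title, body, on))
```

For example, `append_entry("compiled/log.md", "query", "Kelly spawn protocol", "Report: reports/q-kelly-spawn-protocol.md")` would add one dated entry that links the question to its report. Opening the file in append mode keeps the log append-only by construction.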
7. Output formats — Score: 2/10
Evidence: compiled/sources/ and compiled/reports/ hold Markdown (.md) articles; the only other format is the auto-generated .html counterparts of source articles. No Marp slide decks. No matplotlib images. No canvas outputs.
Old score: 2/10. No change.
8. Schema / METHOD.md — Score: 5/10
Evidence: METHOD.md in the dark-factory-kb root describes the pipeline steps (0–5) and quality gates. Section 4 defines the log format: ## [YYYY-MM-DD] type | Title. Section 5 defines lint gate thresholds. However:
- No explicit log-append rule ("after every ingest, append to log.md")
- No index-update rule ("after every ingest, update index.html")
- No concept cross-linking rule in METHOD.md
- No lint-pass trigger described (lint runs via cron, not described in METHOD.md schema)
Old score: 3/10. Slight improvement — log convention is documented, but index and lint rules are missing.
9. Extra tools — Score: 0/10
Evidence: No wiki search engine exists. compiled/sources/ is served as static HTML with no search. There is no CLI tool an LLM can hand a query to, and no processing scripts beyond lint.py and auto_wire_concepts.py.
Old score: 0/10. No change.
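Even a very small search helper would move this score off zero. The sketch below is one possible shape for a grep-style tool, assuming the wiki is a tree of `.md` files; the function names, flags, and default root are illustrative choices, not an existing interface.

```python
import argparse
from pathlib import Path


def search(root, query, limit=20):
    """Return (path, line_number, line) for lines containing every query term."""
    terms = [t.lower() for t in query.split()]
    hits = []
    for path in sorted(Path(root).rglob("*.md")):
        text = path.read_text(encoding="utf-8", errors="replace")
        for number, line in enumerate(text.splitlines(), start=1):
            lowered = line.lower()
            if terms and all(term in lowered for term in terms):
                hits.append((path.as_posix(), number, line.strip()))
                if len(hits) >= limit:
                    return hits
    return hits


def main(argv=None):
    """Command-line entry point; prints hits in grep-like path:line: text form."""
    parser = argparse.ArgumentParser(description="Case-insensitive search over wiki Markdown.")
    parser.add_argument("query")
    parser.add_argument("--root", default="compiled")  # assumed default location
    parser.add_argument("--limit", type=int, default=20)
    args = parser.parse_args(argv)
    for path, number, line in search(args.root, args.query, args.limit):
        print(f"{path}:{number}: {line}")
```

Saved as a script with a call to `main()` at the bottom, this gives an agent a single command to run before answering a question. Matching is line-based and unranked, so it is a starting point rather than a replacement for a real search index.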
10. Filed-back pattern — Score: 7/10
Evidence: Q&A outputs go to compiled/reports/q-*.md and are queryable. The gap analysis itself is filed back. The plantry PRD demo is filed. However:
- Reports are not linked from source articles or concept pages (no related-articles back-reference)
- The main index.html doesn't list Q&A reports as a category
- "Filing back" into the concept/source layer (not just reports/) is not consistently happening
Old score: 0/10. Significant improvement — outputs are being produced and stored. The pattern is partially working.
Summary Scores
| # | Dimension | Apr 27 | May 29 | Trend |
|---|---|---|---|---|
| 1 | log.md | 0/10 | 8/10 | ✅ |
| 2 | index.md/html | 4/10 | 4/10 | → |
| 3 | Raw sources | 6/10 | 6/10 | → |
| 4 | Concept pages | 4/10 | 5/10 | ↑ |
| 5 | Lint operation | 0/10 | 10/10 | ✅ |
| 6 | Query operation | 0/10 | 7/10 | ✅ |
| 7 | Output formats | 2/10 | 2/10 | → |
| 8 | Schema/METHOD.md | 3/10 | 5/10 | ↑ |
| 9 | Extra tools | 0/10 | 0/10 | → |
| 10 | Filed-back pattern | 0/10 | 7/10 | ✅ |
April 2026 total: 19/100
May 2026 total: 54/100
Improvement: +35 points
The KB went from "raw materials exist but nothing runs" to "core machinery running daily." The three biggest improvements were: log.md created, lint.py operational, query loop exercised.
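The totals can be recomputed directly from the summary table, which guards against arithmetic slips when individual scores change. The snippet below is a small sketch; the score pairs are copied from the table above, and the helper names are illustrative.

```python
# Per-dimension scores copied from the summary table: (April 27, May 29).
SCORES = {
    "log.md": (0, 8),
    "index.md/html": (4, 4),
    "Raw sources": (6, 6),
    "Concept pages": (4, 5),
    "Lint operation": (0, 10),
    "Query operation": (0, 7),
    "Output formats": (2, 2),
    "Schema/METHOD.md": (3, 5),
    "Extra tools": (0, 0),
    "Filed-back pattern": (0, 7),
}


def totals(scores):
    """Return (april_total, may_total, improvement) out of 100."""
    april = sum(a for a, _ in scores.values())
    may = sum(m for _, m in scores.values())
    return april, may, may - april


def biggest_gains(scores, count=3):
    """Return the dimensions with the largest score increase, largest first."""
    ranked = sorted(scores, key=lambda name: scores[name][1] - scores[name][0], reverse=True)
    return ranked[:count]


if __name__ == "__main__":
    print(totals(SCORES))
    print(biggest_gains(SCORES))
```

Note that Query operation and Filed-back pattern gained the same number of points, so the third place in `biggest_gains` is decided only by table order.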
Critical Issues (Must Fix)
🔴 Duplicate files: patterns/ vs sources/ (100% overlap)
6 article pairs are duplicates at 100% overlap:
| patterns/ | sources/ |
|---|---|
| auto-spawn-chain.md | auto-spawn-chain.md |
| beads-adoption.md | beads-adoption.md |
| done-marker-dual-write.md | done-marker-dual-write.md |
| operator-control-patterns.md | operator-control-patterns.md |
| ralph-refinements.md | ralph-refinements.md |
| story-by-story-build.md | story-by-story-build.md |
Action: Deduplicate. Pick one canonical location (likely compiled/sources/), remove the compiled/patterns/ copies, and update any wikilinks that reference patterns/ to point to sources/. The patterns/ directory should not exist as a parallel duplicate store.
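Before deleting anything, it is worth confirming that each pair really is identical. The sketch below compares same-named files by SHA-256, which is a stricter test than the overlap score lint reports and is swapped in here for simplicity. The wikilink rewrite assumes links carry a directory prefix such as `[[patterns/...]]`; that syntax is an assumption, so adjust the pattern to whatever the articles actually use.

```python
import hashlib
import re
from pathlib import Path


def file_digest(path):
    """Return the SHA-256 hex digest of a file's bytes."""
    return hashlib.sha256(Path(path).read_bytes()).hexdigest()


def find_duplicate_pairs(dir_a, dir_b):
    """Return (name, identical) for every .md filename present in both directories."""
    a, b = Path(dir_a), Path(dir_b)
    shared = sorted({p.name for p in a.glob("*.md")} & {p.name for p in b.glob("*.md")})
    return [(name, file_digest(a / name) == file_digest(b / name)) for name in shared]


def rewrite_wikilinks(text, old="patterns/", new="sources/"):
    """Repoint [[old...]] links at the canonical directory; prose is left alone."""
    # Assumed link syntax: [[patterns/name]]. Only the text right after "[[" changes.
    return re.sub(r"\[\[" + re.escape(old), lambda m: "[[" + new, text)
```

Only pairs reported as identical are safe to delete without a manual diff; a pair that shares a name but differs in content needs a human decision about which copy is canonical.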
🔴 31 broken URLs
Major categories:
- localhost:18789 — development URLs left in published articles
- steve-yegge.medium.com — 403 (requires auth or blocked)
- superada.ai/skills/* and superada.ai/crew/* — 404 (site restructured)
- simonw.substack.com/p/agentic-engineering-patterns — 404 (post removed?)
- sharpener.dev/ — 404 (site down or changed)
- x.com/i/article/ links — 403 (Twitter/X article blocks)
- github.com/bmad-code-org/BMAD-METHOD — 404 (repo moved or deleted)
Action: For each broken URL: either find the current URL and update, or remove the dead link. Medium.com 403s can be replaced with cached versions or marked as unavailable.
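Triage is easier when the 31 URLs are first grouped by likely cause. The sketch below works on `(url, status)` pairs that a link checker has already collected, so it makes no network requests itself. The category names and the mapping from status codes to causes are assumptions chosen to mirror the list above.

```python
from urllib.parse import urlparse


def categorize(url, status):
    """Map a URL and its HTTP status (None if the request failed) to a likely cause."""
    if "://" not in url:
        url = "http://" + url  # tolerate bare host:port references
    host = urlparse(url).hostname or ""
    if host in ("localhost", "127.0.0.1"):
        return "development URL"
    if status is None:
        return "unreachable"
    if status in (401, 403):
        return "auth required or blocked"
    if status in (404, 410):
        return "moved or removed"
    if 200 <= status < 400:
        return "ok"
    return "other error"


def group_by_cause(results):
    """Group (url, status) pairs by cause so each category can be handled together."""
    groups = {}
    for url, status in results:
        groups.setdefault(categorize(url, status), []).append(url)
    return groups
```

Development URLs can simply be removed, while the "moved or removed" group is the one worth searching for replacement links.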
🟠 7 orphan articles
Articles with 0 inbound links:
- compiled/patterns/ralph-refinements.md
- compiled/patterns/story-by-story-build.md
- compiled/concepts/tufte-viz-principles.md
- compiled/concepts/tufte-viz-skill-origin.md
- compiled/concepts/tufte-viz-pre-post-demos.md
- compiled/sources/ralph-refinements.md
- compiled/sources/story-by-story-build.md
After dedup, the 2 patterns/ entries disappear along with the deleted files. The surviving sources/ copies and the 3 tufte-viz concept pages still need inbound links from articles that reference them.
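For reference, an orphan check of this kind can be sketched in a few lines. This is not the logic inside lint.py, which is not shown here; it assumes links use `[[target]]` or `[[target|label]]` wikilink syntax and that a target resolves by file stem.

```python
import re
from pathlib import Path

# Assumed link syntax: [[target]], [[target|label]] or [[target#section]].
WIKILINK_RE = re.compile(r"\[\[([^\]|#]+)")


def find_orphans(root):
    """Return paths of articles that no other article links to."""
    articles = sorted(Path(root).rglob("*.md"))
    inbound = {path: 0 for path in articles}
    by_stem = {}
    for path in articles:
        by_stem.setdefault(path.stem, []).append(path)
    for source in articles:
        text = source.read_text(encoding="utf-8", errors="replace")
        for target in set(WIKILINK_RE.findall(text)):
            # Reduce "dir/name.md" or "name" to the bare stem "name".
            stem = target.strip().split("/")[-1].removesuffix(".md")
            for path in by_stem.get(stem, []):
                if path != source:  # self-links do not count as inbound
                    inbound[path] += 1
    return [path.as_posix() for path, count in inbound.items() if count == 0]
```

Because targets resolve by stem, a link to a duplicated article credits both copies; removing the duplicates first gives a more trustworthy orphan count.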
🟠 1 completeness failure
compiled/concepts/gas-town-daemons/kelly-deacon-architecture.md missing bibliography field. Add the field or suppress the check.
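A completeness check of this sort might look like the sketch below. It assumes, without confirmation from the source, that article metadata lives in a leading block delimited by `---` lines with `key: value` entries; only the `bibliography` field name comes from the lint report, and the function names are hypothetical.

```python
# Only "bibliography" is named in the lint report; extend this tuple as needed.
REQUIRED_FIELDS = ("bibliography",)


def front_matter_keys(text):
    """Return top-level keys of a leading '---' delimited block, or an empty set."""
    lines = text.splitlines()
    if not lines or lines[0].strip() != "---":
        return set()
    keys = set()
    for line in lines[1:]:
        if line.strip() == "---":
            return keys
        # Indented lines are nested values, not top-level keys.
        if line and not line[0].isspace() and ":" in line:
            keys.add(line.split(":", 1)[0].strip())
    return set()  # no closing delimiter: treat the block as absent


def missing_fields(text, required=REQUIRED_FIELDS):
    """Return the required fields that the article's metadata block lacks."""
    keys = front_matter_keys(text)
    return [field for field in required if field not in keys]
```

Calling `missing_fields` on the article text before and after the fix gives a quick confirmation that the field was added in the right place.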
Improvement Priorities (Ranked)
- Dedup patterns/ vs sources/: remove the 6 duplicate files, update wikilinks, and delete the patterns/ directory.
- Fix 31 broken URLs: categorize by cause (site down, auth required, moved), then update or remove.
- Wire tufte-viz concept pages: find articles that mention tufte-viz and add wikilinks.
- Add log entries for Q&A sessions: append ## [date] query | Question when producing Q&A reports.
- Auto-generate index.html on ingest: add a generate_index.py call to the ingest workflow so new articles appear without manual edits.
- Add CONCEPT_INDEX.md: a category-organized catalog of all 61 concept pages with one-line summaries.
- Build a basic wiki search tool: even a grep-based CLI tool the LLM can use as a query assistant.
- Add a SOURCES.md catalog: document all raw sources with hashes and dates for immutability tracking.
Relation to Karpathy Pattern
The KB now implements the core LLM Wiki machinery:
| Operation | Status |
|---|---|
| Raw sources layer | ✅ Exists |
| LLM-generated wiki layer | ✅ Exists |
| Schema (METHOD.md conventions) | ✅ Partial |
| log.md append-only | ✅ Running (41 entries) |
| Lint (health checks) | ✅ Running daily |
| Query against wiki | ✅ 9 Q&A reports produced |
| Filed back outputs | ✅ In reports/, partially |
| index.md auto-update | ❌ Manual |
| Extra tools (search) | ❌ None |
| Multi-format output | ❌ md + HTML only |
The KB has graduated from "inactive" to "operational." The remaining gaps are refinement, not foundational.
Related Articles
karpathy-llm-wiki, kelly-gas-town-gap-analysis, kelly-router, kelly-handbook-ch8-memory, ralph-protocol, five-layer-memory-system
Generated: 2026-05-29. Lint data from reports/lint-2026-05-29.md.