Open Knowledge Format (OKF): Google Launch
Type: Source / Specification
Date: 2026-06-13
Sources: Google Cloud blog post + GitHub SPEC.md (v0.1 Draft) + Marie Haynes (@Marie_Haynes Twitter)
Tier: 1 (institutional primary)
What Is OKF
The Open Knowledge Format (OKF) is an open, vendor-neutral specification for representing knowledge as a directory of markdown files with YAML frontmatter. It formalizes the "LLM wiki" pattern — markdown + frontmatter knowledge bases that agents can read, write, and interlink — into a portable, interoperable standard.
OKF v0.1 (Draft) was published by Google's Cloud Data Analytics team on 2026-06-13. The stated goal: give the AI agent ecosystem a common knowledge representation format so that knowledge written by one producer (human, pipeline, or agent) can be consumed by any other consumer without translation.
The core philosophy: minimal conventions, maximum portability. If you can cat a file, you can read OKF. If you can git clone a repo, you can ship it.
Technical Specification
Directory / File Structure
An OKF bundle is a directory tree of .md files. The structure is producer-defined — no mandated taxonomy.
my_bundle/
├── index.md              # Optional. Directory listing for progressive disclosure.
├── log.md                # Optional. Chronological update history.
├── concept-at-root.md    # A concept at the bundle root.
└── subdirectory/
    ├── index.md          # Optional subdirectory listing.
    ├── concept-a.md
    └── subdirectory/
        └── …
Concept ID = the file's path within the bundle, with the .md suffix removed.
Example: tables/orders.md → concept ID tables/orders.
Reserved filenames (must not be used for concept documents):
- index.md — directory listing
- log.md — update history
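The path-to-ID rule and the reserved filenames can be sketched in a few lines of Python. This is an illustration only: the function name `concept_id` and the choice to return `None` for non-concept files are assumptions, not part of the spec, and reserved names are matched case-sensitively here.

```python
from pathlib import PurePosixPath
from typing import Optional

# Reserved filenames listed above; they never name a concept document.
RESERVED_FILENAMES = {"index.md", "log.md"}


def concept_id(relative_path: str) -> Optional[str]:
    """Return the concept ID for a bundle-relative path.

    Returns None when the file is not a concept document, i.e. it is not
    a .md file or it uses a reserved filename. (Hypothetical helper.)
    """
    path = PurePosixPath(relative_path)
    if path.suffix != ".md" or path.name in RESERVED_FILENAMES:
        return None
    # Concept ID = path within the bundle with the .md suffix removed.
    return str(path.with_suffix(""))


print(concept_id("tables/orders.md"))       # tables/orders
print(concept_id("subdirectory/index.md"))  # None
```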
Concept File Format
Every concept is one UTF-8 markdown file with two parts:
- YAML frontmatter: delimited by `---` on its own line at the top
- Markdown body: everything after the frontmatter
Required frontmatter field:
- type — a short string identifying the kind of concept. Consumers use this for routing, filtering, and presentation. Examples: BigQuery Table, Metric, Playbook, API Endpoint. Values are not centrally registered; consumers must tolerate unknown types gracefully.
Recommended frontmatter fields (in priority order):
- title — human-readable display name
- description — one-sentence summary
- resource — canonical URI for the underlying asset
- tags — YAML list of short strings for cross-cutting categorization
- timestamp — ISO 8601 datetime of last meaningful change
Extensions: Producers MAY include any additional keys. Consumers SHOULD preserve unknown keys when round-tripping.
Conventional body section headings (optional, but recommended when applicable):
- # Schema — structured description of an asset's columns/fields
- # Examples — concrete usage examples (fenced code blocks)
- # Citations — external sources backing claims, numbered
Example concept file:
---
type: BigQuery Table
title: Customer Orders
description: One row per completed customer order across all channels.
resource: https://console.cloud.google.com/bigquery?p=acme&d=sales&t=orders
tags: [sales, orders, revenue]
timestamp: 2026-05-28T14:30:00Z
---
# Schema
| Column | Type | Description |
|---------------|-----------|------------------------------------------|
| `order_id` | STRING | Globally unique order identifier. |
| `customer_id` | STRING | FK into [customers](/tables/customers.md). |
| `total_usd` | NUMERIC | Order total in US dollars. |
Part of the [sales dataset](/datasets/sales.md).
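To show how little machinery a consumer needs, here is a minimal sketch that splits a concept file into frontmatter fields and body. It is deliberately naive: it only understands flat `key: value` lines and inline `[a, b]` lists, which is enough for the example above but is not a YAML parser. A real consumer would presumably hand the frontmatter block to a proper YAML library. The function name `split_frontmatter` is an assumption.

```python
from typing import Dict, Tuple


def split_frontmatter(text: str) -> Tuple[Dict[str, object], str]:
    """Split a concept file into (frontmatter fields, markdown body).

    Naive sketch: handles flat `key: value` pairs and inline lists only.
    All scalar values are kept as strings.
    """
    lines = text.splitlines()
    if not lines or lines[0].strip() != "---":
        raise ValueError("missing opening --- delimiter")
    end = None
    for i, line in enumerate(lines[1:], start=1):
        if line.strip() == "---":
            end = i
            break
    if end is None:
        raise ValueError("missing closing --- delimiter")

    fields: Dict[str, object] = {}
    for line in lines[1:end]:
        if not line.strip():
            continue
        # Split on the first colon only, so URLs and timestamps survive.
        key, sep, value = line.partition(":")
        if not sep:
            raise ValueError(f"cannot parse frontmatter line: {line!r}")
        value = value.strip()
        if value.startswith("[") and value.endswith("]"):
            fields[key.strip()] = [v.strip() for v in value[1:-1].split(",") if v.strip()]
        else:
            fields[key.strip()] = value
    return fields, "\n".join(lines[end + 1:])


sample = """---
type: BigQuery Table
title: Customer Orders
tags: [sales, orders, revenue]
timestamp: 2026-05-28T14:30:00Z
---
# Schema
"""
fields, body = split_frontmatter(sample)
print(fields["type"])  # BigQuery Table
print(fields["tags"])  # ['sales', 'orders', 'revenue']
```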
Linking
OKF supports two forms of standard markdown links:
- Root-relative paths (recommended): links beginning with `/` are resolved relative to the bundle root. Example: `See the [customers table](/tables/customers.md) for the join key.`
- Standard relative paths: `./other.md`, `../other.md`, etc.
Links are untyped — the relationship kind (parent/child, joins-with, depends-on) is conveyed by surrounding prose, not by the link itself. Consumers building a graph view treat all links as directed edges of an untyped relationship.
Broken links are tolerated, not treated as errors. A link to a not-yet-written concept is fine.
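The two link forms can be resolved to concept IDs with standard path handling. The sketch below is one possible approach, not a reference implementation: the regex covers only simple inline links, the names `resolve_link` and `outgoing_links` are hypothetical, and links that climb above the bundle root are not handled. Each result is a directed, untyped edge, and no check is made that the target exists, since broken links are tolerated.

```python
import posixpath
import re
from typing import List, Tuple

# Matches simple inline markdown links: [label](target)
LINK_PATTERN = re.compile(r"\[[^\]]*\]\(([^)\s]+)\)")


def resolve_link(source_id: str, target: str) -> str:
    """Resolve a link target found in concept `source_id` to a concept ID."""
    target = target.split("#", 1)[0]  # drop any in-page fragment
    if target.startswith("/"):
        # Root-relative: resolved against the bundle root.
        resolved = posixpath.normpath(target.lstrip("/"))
    else:
        # Standard relative: resolved against the linking file's directory.
        base = posixpath.dirname(source_id)
        resolved = posixpath.normpath(posixpath.join(base, target))
    return resolved[:-3] if resolved.endswith(".md") else resolved


def outgoing_links(source_id: str, body: str) -> List[Tuple[str, str]]:
    """Return directed (source, target) edges for links to other .md files."""
    edges = []
    for match in LINK_PATTERN.finditer(body):
        target = match.group(1).split("#", 1)[0]
        if "://" in target or not target.endswith(".md"):
            continue  # external URL or non-concept target
        edges.append((source_id, resolve_link(source_id, target)))
    return edges


body = "FK into [customers](/tables/customers.md). Part of the [sales dataset](../datasets/sales.md)."
print(outgoing_links("tables/orders", body))
# [('tables/orders', 'tables/customers'), ('tables/orders', 'datasets/sales')]
```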
How Agents Consume It
OKF requires no specific tooling. Agents consume it by:
- Reading .md files directly (no SDK required)
- Parsing YAML frontmatter for structured fields
- Traversing markdown links to build a knowledge graph
- Optionally, using Google's reference enrichment agent (BigQuery → OKF draft) or its static HTML visualizer
Google has updated Knowledge Catalog (their enterprise catalog product) to ingest OKF and serve it to agents.
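As an illustration of SDK-free consumption, the sketch below walks a bundle directory and groups concept IDs by their `type` field, the kind of routing or filtering step a consumer might perform first. The helper names are hypothetical, and the `type` lookup is a naive line scan rather than real YAML parsing. The demo bundle is built in a temporary directory purely for the example.

```python
import tempfile
from collections import defaultdict
from pathlib import Path
from typing import Dict, List

RESERVED = {"index.md", "log.md"}


def read_type(text: str) -> str:
    """Return the frontmatter `type` value, or "" if absent (naive scan)."""
    lines = text.splitlines()
    if not lines or lines[0].strip() != "---":
        return ""
    for line in lines[1:]:
        if line.strip() == "---":
            break  # end of frontmatter
        if line.startswith("type:"):
            return line[len("type:"):].strip()
    return ""


def concepts_by_type(bundle_root: Path) -> Dict[str, List[str]]:
    """Map each `type` value to the concept IDs that declare it."""
    index = defaultdict(list)
    for path in sorted(bundle_root.rglob("*.md")):
        if path.name in RESERVED:
            continue  # listings and logs are not concepts
        concept = path.relative_to(bundle_root).with_suffix("").as_posix()
        index[read_type(path.read_text(encoding="utf-8"))].append(concept)
    return dict(index)


# Demo: a tiny throwaway bundle.
with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    (root / "tables").mkdir()
    (root / "index.md").write_text("Directory listing.\n", encoding="utf-8")
    (root / "revenue.md").write_text("---\ntype: Metric\n---\n", encoding="utf-8")
    (root / "tables" / "orders.md").write_text(
        "---\ntype: BigQuery Table\n---\n# Schema\n", encoding="utf-8"
    )
    index = concepts_by_type(root)

print(index)
```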
Distribution
A bundle MAY be distributed as:
- A git repository (recommended — history, attribution, diffs)
- A tarball or zip archive
- A subdirectory within a larger repository
Conformance (v0.1)
A bundle is conformant if:
- Every non-reserved .md file has a parseable YAML frontmatter block
- Every frontmatter block contains a non-empty type field
- Reserved filenames follow their specified structure when present
Consumers MUST NOT reject a bundle for: missing optional fields, unknown type values, unknown additional frontmatter keys, broken cross-links, or missing index.md files. This permissive model is intentional — OKF is meant to remain useful as bundles grow, get refactored, and are partially generated by agents.
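The first two conformance rules are simple enough to check mechanically. The sketch below takes a bundle as an in-memory mapping of bundle-relative paths to file contents and reports violations. It is an approximation under stated assumptions: "parseable YAML" is reduced to "has a delimited frontmatter block", the structure of reserved files is not checked because this note does not describe it, and `check_bundle` is a hypothetical name. Broken links and missing optional fields are deliberately not reported.

```python
from typing import Dict, List, Optional

RESERVED = {"index.md", "log.md"}


def frontmatter_lines(text: str) -> Optional[List[str]]:
    """Return the lines between the --- delimiters, or None if not delimited."""
    lines = text.splitlines()
    if not lines or lines[0].strip() != "---":
        return None
    for i, line in enumerate(lines[1:], start=1):
        if line.strip() == "---":
            return lines[1:i]
    return None


def check_bundle(files: Dict[str, str]) -> List[str]:
    """Return human-readable conformance problems (empty list = none found)."""
    problems = []
    for path, text in sorted(files.items()):
        name = path.rsplit("/", 1)[-1]
        if not path.endswith(".md") or name in RESERVED:
            continue
        block = frontmatter_lines(text)
        if block is None:
            problems.append(f"{path}: no delimited frontmatter block")
            continue
        # Naive: look for a top-level `type:` line with a non-empty value.
        values = [
            line.split(":", 1)[1].strip().strip("'\"")
            for line in block
            if line.startswith("type:")
        ]
        if not any(values):
            problems.append(f"{path}: missing or empty type field")
    return problems


bundle = {
    "index.md": "Directory listing.",
    "tables/orders.md": "---\ntype: BigQuery Table\n---\nSee [x](/missing.md).",
    "bad.md": "# No frontmatter here",
    "empty.md": "---\ntitle: Untyped\ntype:\n---\n",
}
for problem in check_bundle(bundle):
    print(problem)
# bad.md: no delimited frontmatter block
# empty.md: missing or empty type field
```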
Google's Reference Implementations
- Enrichment agent — walks a BigQuery dataset, drafts an OKF concept document per table/view, then crawls authoritative docs to enrich with citations, schemas, and join paths.
- Static HTML visualizer — turns any OKF bundle into an interactive graph view in a single self-contained HTML file; no backend.
- Sample bundles — GA4 e-commerce, Stack Overflow, Bitcoin public datasets — produced by the reference agent and committed to the repo as living conformant examples.
USEFULNESS ANALYSIS
What Is OKF (in one paragraph)
OKF is a vendor-neutral open specification that represents knowledge as a directory tree of markdown files with YAML frontmatter. It formalizes the "LLM wiki" pattern — a practice that had organically emerged across tools like Obsidian, Karpathy's LLM-Wiki gist, AGENTS.md/CLAUDE.md convention files, and various agent knowledge bases — into a standardized, interoperable format. The only required field is type; everything else is producer-defined. Links are standard markdown links with /-prefixed root-relative paths. No SDK is required to produce or consume. OKF's stated purpose is to give the AI agent ecosystem a common knowledge representation so that knowledge written by one party can be consumed by any other without translation layers.
How It Compares to What We Already Have
dark-factory-kb itself:
Our KB already uses the OKF pattern organically — markdown files with YAML frontmatter, directory-organized concepts, index.md directory listings, log.md update histories, and cross-links between concepts. The frontmatter fields overlap substantially (id/type, title, source/resource, tags, date/timestamp, key_claims/description). The main structural difference is our use of wikilink notation versus OKF's standard markdown links with root-relative paths.
kb_shared/ tooling:
kb_shared compiles markdown sources into HTML for deployment. It could theoretically emit OKF-compliant directories as an alternative output target — the source format is the same, just with different conventions for linking and a slightly different frontmatter schema.
Obsidian / Notion:
These are consumer-side tools. Obsidian vaults are plain markdown files and close in shape to OKF (markdown + YAML frontmatter + wikilinks), but Obsidian's link format and local attachment handling diverge from OKF's conventions. Notion keeps content in proprietary storage behind an API rather than in files, making it less directly compatible.
Karpathy LLM-Wiki:
OKF explicitly cites Karpathy's LLM Wiki gist as the foundational pattern. Karpathy's insight: LLMs don't get bored, don't forget cross-references, and can touch many files in one pass — making the bookkeeping cost of a personal wiki zero for an agent. OKF takes this pattern and adds the minimal standardization needed for interoperability across producers and consumers.
PIP-097 / PIP-100 KB patterns:
These are internal factory patterns for knowledge base construction. If they share the markdown + frontmatter + wikilink structure, they are structurally compatible with OKF. The value OKF adds is an external, documented standard we could align to rather than inventing our own conventions.
What We Could Use It For: Concrete Use Cases
1. Export dark-factory-kb to OKF format:
Our KB is already ~80% OKF-compatible. The lift to make it fully conformant is low:
- Convert wikilinks to /-prefixed root-relative markdown links (e.g., `[[some-concept]]` → `[some-concept](/concepts/some-concept.md)`)
- Map our key_claims to OKF's description field
- Map our id to concept ID (file path)
- Add type to every frontmatter block (we have type: source already)
- Add timestamp where missing
Once exported, our KB becomes portable to any OKF-compatible consumer or agent.
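The wikilink conversion is the largest of these steps, so here is a rough sketch of how it might look. It assumes our wikilinks use the Obsidian-style `[[target]]` and `[[target|label]]` forms, and that a lookup table from wikilink target to bundle-relative path is available. Both are assumptions about our KB, and `convert_wikilinks` is a hypothetical helper, not an existing kb_shared function. Targets missing from the table are left untouched rather than guessed.

```python
import re
from typing import Dict

# [[target]] or [[target|label]] (assumed wikilink syntax)
WIKILINK = re.compile(r"\[\[([^\]|]+)(?:\|([^\]]+))?\]\]")


def convert_wikilinks(text: str, paths: Dict[str, str]) -> str:
    """Rewrite wikilinks as root-relative markdown links.

    `paths` maps a wikilink target to its bundle-relative .md path.
    """
    def replace(match: "re.Match[str]") -> str:
        target = match.group(1).strip()
        label = (match.group(2) or target).strip()
        path = paths.get(target)
        if path is None:
            return match.group(0)  # unknown target: leave the wikilink as-is
        return f"[{label}](/{path})"

    return WIKILINK.sub(replace, text)


paths = {
    "some-concept": "concepts/some-concept.md",
    "other": "concepts/other.md",
}
print(convert_wikilinks("See [[some-concept]] and [[other|the other one]].", paths))
# See [some-concept](/concepts/some-concept.md) and [the other one](/concepts/other.md).
```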
2. Extend kb_shared tools to emit OKF-compliant directories:
kb_shared could gain an --okf export mode that:
- Writes compiled sources to a directory tree matching OKF conventions
- Generates index.md files per directory
- Generates log.md entries for each compile batch
- Emits root-relative markdown links instead of HTML-wrapped links
This would make dark-factory-kb an OKF producer, feeding into any OKF-compatible system.
3. OKF as complement to the HTML compile pipeline, not replacement:
The "compile to HTML + deploy" pipeline serves human browsing and search indexing. OKF serves agent consumption. These are complementary, not competing — the same source files could feed both pipelines. A kb_shared compile could produce both HTML output (for human readers) and an OKF bundle (for agent ingestion).
4. Make dark-factory-kb available to Google Knowledge Catalog:
Google has updated Knowledge Catalog to ingest OKF. If we export our KB to OKF format, Ovi's agents could consume it via Google's enterprise catalog. This is a potential future integration path.
5. Interoperability with other OKF producers:
Sample bundles from Google (GA4, Stack Overflow, Bitcoin) are already in the repo. We could consume these as reference knowledge, or cross-link our OKF exports with external OKF bundles.
Limitations
- v0.1 Draft: The spec is explicitly a draft. Future versions may introduce breaking changes (major version bump). Bundles should declare their target version in root index.md frontmatter.
- No query language: OKF is a representation format, not a query format. All navigation is manual or through agent traversal. There's no SPARQL equivalent.
- Untyped links: Links carry no semantic relationship type — consumers must infer from prose context. This is intentional minimalism, but it limits graph query expressiveness.
- No standard tooling ecosystem: Beyond Google's reference implementations, there's no established producer/consumer tooling ecosystem yet. Early days.
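- Consumer tolerance limits validation pressure: Because consumers MUST NOT reject a bundle for missing optional fields, unknown types or keys, or broken links, producers have little incentive to emit tidy output. Only the frontmatter and type requirements are enforceable. The spec is soft by design.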
- Is it just what we already do? Yes and no. The pattern is familiar, but OKF's value is the interoperability surface — a documented, shared convention rather than a bespoke one. The lift to adopt is low precisely because we're already doing it.
Verdict
Worth pursuing. Priority: High. Time to value: Short.
OKF is not a new pattern — it's the formalization of a pattern we already use. The lift to make dark-factory-kb OKF-compliant is small (primarily a linking convention change), and the upsides are concrete:
- Interoperability: Our KB becomes portable to any OKF-compatible consumer, including Google's Knowledge Catalog and any future OKF tooling ecosystem.
- kb_shared extension: Adding an --okf export mode to kb_shared is a straightforward enhancement that opens new integration paths.
- Dual pipeline: Compiling to both HTML (human reading) and OKF (agent consumption) from the same sources is a natural fit.
- Ecosystem signal: Google publishing this as an open standard with reference implementations is a strong signal. Early adoption positions us well.
Recommended next steps (in priority order):
1. Convert existing wikilinks to /-prefixed markdown links in compiled sources (lowest lift, immediate OKF compatibility)
2. Add type and timestamp fields to all concept frontmatter (already have type: source on most)
3. Add --okf export mode to kb_shared (medium lift, new capability)
4. Consider a pilot OKF bundle export of a subset of dark-factory-kb to test with Google's visualizer or Knowledge Catalog
The format is simple enough that we can adopt it incrementally without a big migration. The key insight: we haven't been doing anything wrong — we've been doing the same thing OKF standardizes. Now we have a name for it and an interoperability claim to make.
Sources
| Source | Type | Date | Key Notes |
|---|---|---|---|
| Marie Haynes (@Marie_Haynes) Twitter | Social / trigger | 2026-06-13 | "Makes it really easy to make a digital brain that agents can use" |
| Google Cloud Blog Post | Official announcement | 2026-06-13 | Framing, problem statement, reference implementations |
| GitHub SPEC.md (knowledge-catalog/okf) | Normative spec | 2026-06-13 (v0.1 Draft) | Full technical specification |
| Karpathy LLM-Wiki gist | Cited inspiration | ~2025 | OKF explicitly formalizes this pattern |