Date Compiled: 2026-04-28
Type: concept — factory-methodology
Related Questions: autonomous-cron-management, meta-agents, ai-fleet-management, fleet-monitoring, cron-automation
Related Articles: superada-enterprise-operations, superada-multi-agent-architecture, lobster-pipelines, super-ada


Meta-Crons — The Four Agents That Keep the Fleet Alive

What Are Meta-Crons?

Meta-crons are cron jobs that manage other cron jobs. At fleet scale (~98 active autonomous tasks across the Enterprise Crew), ordinary monitoring and recovery are insufficient: broken crons need to be fixed, not just reported. The Enterprise Crew built a meta-layer of four dedicated cron agents, each with a distinct function.

These four meta-crons run on fixed schedules and form the immune system of the fleet: the layer that catches failures, enforces optimization, and keeps the system running when API limits, broken integrations, and silent failures would otherwise degrade performance.

The Four Meta-Crons

1. The Model Orchestrator ("The Fixer") — Every 6 Hours

Function: Dynamic load balancer and crisis responder for the LLM API layer.

Behavior:
- Health Check: Runs check-providers.sh to ping core LLM API endpoints and records latency and status in provider-status.json.
- Audit: Lists all enabled crons and auto-updates deprecated model references to current versions.
- Recovery: On rate_limit, 429, or "All models failed" errors, checks provider health and re-runs the failed cron immediately.
- Delivery Fixing: Detects "thread not found" errors and auto-updates the delivery target to the #mail-room fallback channel.

Key principle: Fix problems immediately. The Orchestrator does not log and queue issues for later; it fixes them now.
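
The recovery and delivery-fixing rules above could be sketched as a small triage function. This is a minimal illustration, not the Enterprise Crew's implementation: the `Action` type, the `triage` name, and the decision to skip the re-run when no provider is healthy are assumptions. Only the error strings and the #mail-room fallback come from the text.

```python
from dataclasses import dataclass

# Error markers named in the text; matching is a simple case-insensitive
# substring test, which is an assumption about how errors are reported.
RETRY_MARKERS = ("rate_limit", "429", "All models failed")
DELIVERY_MARKER = "thread not found"
FALLBACK_CHANNEL = "#mail-room"


@dataclass
class Action:
    kind: str          # hypothetical labels: "rerun", "retarget", or "none"
    detail: str = ""


def triage(last_error: str, providers_healthy: bool) -> Action:
    """Map a cron's last error message to an immediate fix."""
    message = last_error.lower()
    if DELIVERY_MARKER in message:
        # Broken delivery target: point the cron at the fallback channel.
        return Action("retarget", FALLBACK_CHANNEL)
    if any(marker.lower() in message for marker in RETRY_MARKERS):
        # Assumption: re-run only if the provider health check passed.
        if providers_healthy:
            return Action("rerun")
        return Action("none", "no healthy provider")
    return Action("none")
```

For example, `triage("HTTP 429 Too Many Requests", True)` yields a "rerun" action, while a "thread not found" error yields a "retarget" action pointing at #mail-room.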

2. The Cron Watcher ("The Architect") — Every 6 Hours

Function: Unified cron governance engine. Handles structural optimization across the fleet.

Behavior:
- Lobster Detection: Reads cron payloads. If a payload contains 3+ deterministic sequential steps, tags the cron as LOBSTER_CANDIDATE.
- Clustering: Groups small single-step crons that share a schedule into cluster crons, reducing compute and context overhead.
- Auto-Implementation: Takes a JSON snapshot of the current state and logs a rollback manifest before applying safe fixes.
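
The detection and clustering rules can be illustrated with a short sketch. The payload shape used here (`id`, `schedule`, `steps`, and a per-step `deterministic` flag) is hypothetical, since the source does not describe the real cron schema. The sketch also reads "3+ deterministic sequential steps" as three or more consecutive deterministic steps.

```python
from collections import defaultdict


def is_lobster_candidate(steps, threshold=3):
    """True if `steps` contains `threshold` or more consecutive deterministic steps."""
    run = 0
    for step in steps:
        # A non-deterministic step resets the current run.
        run = run + 1 if step.get("deterministic") else 0
        if run >= threshold:
            return True
    return False


def cluster_by_schedule(crons):
    """Group single-step crons by schedule; keep groups with at least two members."""
    groups = defaultdict(list)
    for cron in crons:
        if len(cron["steps"]) == 1:
            groups[cron["schedule"]].append(cron["id"])
    return {schedule: ids for schedule, ids in groups.items() if len(ids) > 1}
```

A schedule shared by only one single-step cron is left alone, because merging it would not reduce the number of runs.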

Safety constraint: Only safe operations are auto-implemented — batch delivery migration, Lobster conversion, model reference fixes, cluster merges. The snapshot + rollback manifest ensures any change can be undone.
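
A minimal sketch of the snapshot-then-manifest pattern, under stated assumptions: `state` is a dict mapping cron id to a settings dict, and each fix is a `(cron_id, field, new_value)` tuple. Both shapes and both function names are hypothetical. The point is the ordering: the snapshot and the full manifest exist before anything is mutated.

```python
import json


def plan_and_apply(state, fixes):
    """Snapshot the state, record a rollback manifest, then apply the fixes."""
    # 1. JSON snapshot of the current state, taken before any change.
    snapshot = json.dumps(state, sort_keys=True)
    # 2. Rollback manifest: the old and new value for every planned change.
    #    Assumes each field already exists (a missing field raises KeyError).
    manifest = [
        {"cron": cron_id, "field": field, "old": state[cron_id][field], "new": new}
        for cron_id, field, new in fixes
    ]
    # 3. Only now mutate the state.
    for entry in manifest:
        state[entry["cron"]][entry["field"]] = entry["new"]
    return snapshot, manifest


def rollback(state, manifest):
    """Undo the changes in reverse order using the recorded old values."""
    for entry in reversed(manifest):
        state[entry["cron"]][entry["field"]] = entry["old"]
```

After `rollback`, serializing the state again should reproduce the original snapshot, which gives a cheap check that the undo was complete.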

3. The Daily Health Report ("The Watchdog") — Twice Daily (8am and 8pm UTC)

Function: Passive scan of the full fleet. Posts an emoji-coded status report to Discord #medbay.

Flags:
- lastStatus = error → 🔴 FAILING
- lastDurationMs > 300000 (5 minutes) → 🟡 SLOW (potential infinite loop)
- Was scheduled to run but has no lastRunAtMs → 🟡 NEVER-RAN
- Model name typos → 🟡 TYPO

Design: Unlike the Model Orchestrator (which fixes things), the Health Report is passive — it surfaces issues that require human review or that the Orchestrator is not authorized to auto-fix.
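
The flag rules above could be expressed as a pure function over one cron record. The field names `lastStatus`, `lastDurationMs`, and `lastRunAtMs` come from the text. `firstDueAtMs`, `model`, and the `known_models` allow-list are hypothetical stand-ins for however the real report decides that a cron "should have run" or that a model name is a typo.

```python
SLOW_THRESHOLD_MS = 5 * 60 * 1000  # 5 minutes, per the SLOW rule


def health_flags(cron, now_ms, known_models):
    """Return the list of report flags that apply to one cron record."""
    flags = []
    if cron.get("lastStatus") == "error":
        flags.append("🔴 FAILING")
    duration = cron.get("lastDurationMs")
    if duration is not None and duration > SLOW_THRESHOLD_MS:
        flags.append("🟡 SLOW")
    # Hypothetical field: the time at which the cron was first due to run.
    first_due = cron.get("firstDueAtMs")
    if cron.get("lastRunAtMs") is None and first_due is not None and first_due <= now_ms:
        flags.append("🟡 NEVER-RAN")
    # Assumption: a typo is any model name missing from a known-good list.
    model = cron.get("model")
    if model is not None and model not in known_models:
        flags.append("🟡 TYPO")
    return flags
```

The function only reads its inputs and returns labels, which matches the passive role of the report: nothing here modifies a cron.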

4. The Auto-Lobster Converter — Sundays at 2:00 AM UTC

Function: Dry-run validator for Lobster pipeline readiness.

Behavior:
- Scans for crons tagged LOBSTER_CANDIDATE
- Runs lobster-converter-core.sh validation
- Posts summary report to #upgrades (dry-run mode — does not modify crons)

Design: This meta-cron never modifies anything. It validates and reports for human review. The actual Lobster migration is a human decision.


External Watchdogs (n8n)

n8n (a self-hosted workflow engine) is monitored externally by OpenClaw meta-crons: cluster:hourly-ops includes an n8n-health step that pings the n8n API from outside the n8n instance.

Principle: Don't run watchers inside the system being watched. If n8n goes down and the watcher is running on the same n8n instance, the watcher goes down too.

If n8n goes down, Ada detects it via the external monitor and alerts in Discord #upgrades or #medbay before Henry notices.
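
An external health probe of this kind can be very small. The sketch below is an assumption about what an n8n-health step might look like, not the actual step: the `is_up` name is hypothetical, and the URL is a caller-supplied placeholder rather than a documented fleet endpoint. It uses only the Python standard library and is meant to run on a host other than the one being watched.

```python
import urllib.request


def is_up(url, timeout=5.0, opener=urllib.request.urlopen):
    """Return True if `url` answers with a 2xx status within `timeout` seconds.

    `opener` is injectable so the check can be exercised without a network.
    """
    try:
        with opener(url, timeout=timeout) as response:
            return 200 <= response.status < 300
    except OSError:
        # URLError, HTTPError, and socket timeouts are all OSError subclasses,
        # so refused connections, DNS failures, and 4xx/5xx replies land here.
        return False
```

Because the probe treats every connection failure as "down", it still produces a result when the watched service has stopped entirely, which is exactly the case an in-process watcher would miss.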


Operational Insights

  1. Auto-fix, don't just log — The Model Orchestrator is authoritative: it fixes now, not after a human review
  2. Rollback manifests before changes — The Cron Watcher snapshots before every auto-implementation so fixes are reversible
  3. Watchdogs must be external — Don't monitor your monitoring system from inside itself
  4. Cron governance compounds — The Cron Watcher makes the fleet progressively more efficient each run (Lobster candidates tagged, clusters merged, deprecated models updated)
  5. Health reports are passive — They're for issues the Orchestrator isn't authorized to auto-fix (human review needed)

Related Articles

  • lobster-pipelines — what the Auto-Lobster Converter identifies and validates for migration
  • superada-enterprise-operations — the broader context of fleet operations and 98 active crons
  • Kelly router — different approach to fleet monitoring in dark-factory-kb's pipeline
  • superada — the orchestrator that owns and operates the fleet under this policy

Sources: blog-managing-crons.md (raw/superada/), superada-enterprise-operations.md (compiled/sources/)