Closed Feedback Loop
Type: Architectural principle
Referenced from: closed-loop-agent-control
Definition
A closed feedback loop in agent architecture means the agent has direct, programmatic access to the target system's state and can act on it without human intermediation. The loop is: agent acts → state updates → agent reads state → agent acts again. This contrasts with human-in-the-loop designs, where the agent produces output, a human observes and interprets it, and then feeds directions back, which adds latency and loses information along the way.
Nikita M.'s formulation: "If I give the agent the right tools and close the feedback loop, even a smaller model with a closed loop can outperform a stronger model that relies on human-observed feedback."
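The act, read, act cycle from the definition can be sketched in a few lines. This is a minimal illustration, not the setup from the experiment: the `Thermostat` class, its `read_state` and `apply` methods, and the proportional correction rule are all hypothetical stand-ins for a target system and an agent policy.

```python
from dataclasses import dataclass


@dataclass
class Thermostat:
    """Hypothetical target system with machine-readable state."""
    temperature: float = 15.0

    def read_state(self) -> dict:
        # Structured state the agent can parse directly.
        return {"temperature": self.temperature}

    def apply(self, delta: float) -> None:
        # An action mutates the system state.
        self.temperature += delta


def run_closed_loop(system: Thermostat, target: float, max_steps: int = 50) -> int:
    """Read state, act, and repeat until within 0.1 of target; return steps used."""
    for step in range(max_steps):
        state = system.read_state()        # agent reads state
        error = target - state["temperature"]
        if abs(error) < 0.1:
            return step
        system.apply(0.5 * error)          # agent acts, state updates
    return max_steps


steps = run_closed_loop(Thermostat(), target=21.0)
print(f"converged after {steps} actions")
```

No human sits between `read_state` and `apply`, so the only cost per iteration is the call itself.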
Three Mechanisms That Make It Work
- Latency Collapse: Human-in-the-loop feedback passes through several hops (agent acts → human observes → human interprets → human articulates → agent receives). With tools, the loop shrinks to agent acts → state updates → agent reads state, so cycle time is bounded only by the tool-call round trip.
- Information Fidelity: Human feedback is lossy ("the bird hit the ceiling" drops velocity, position, and timing). Structured state access returns complete, parseable data that needs no interpretation.
- Temporal Control: The agent decides when the world advances (via clock control), which turns it from a passive observer into an active experimenter.
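The information-fidelity and temporal-control mechanisms above can be illustrated with a toy world whose clock moves only when the agent asks. Everything here is an assumption made for illustration: the `BirdWorld` class, its physics constants, and the `run_trial` helper are hypothetical and do not describe the game or tools used in the experiment.

```python
from dataclasses import dataclass, asdict

# Hypothetical physics constants for the toy world.
GRAVITY = -1.0
FLAP_VELOCITY = 3.0
CEILING = 10.0


@dataclass
class BirdWorld:
    """Toy world that advances only when step() is called."""
    tick: int = 0
    y: float = 5.0        # height; floor at 0.0, ceiling at CEILING
    vy: float = 0.0       # vertical velocity per tick
    alive: bool = True

    def flap(self) -> None:
        self.vy = FLAP_VELOCITY

    def step(self, ticks: int = 1) -> dict:
        """Advance the clock by `ticks` and return the full structured state."""
        for _ in range(ticks):
            if not self.alive:
                break
            self.vy += GRAVITY
            self.y += self.vy
            self.tick += 1
            if self.y <= 0.0 or self.y >= CEILING:
                self.alive = False
        return asdict(self)


def run_trial(flap: bool, ticks: int) -> dict:
    """One controlled experiment: choose an input, then advance exactly `ticks`."""
    world = BirdWorld()
    if flap:
        world.flap()
    return world.step(ticks)


with_flap = run_trial(flap=True, ticks=3)
without_flap = run_trial(flap=False, ticks=3)
print(with_flap)
print(without_flap)
```

Each trial returns position, velocity, and tick count together, which is the detail a verbal report such as "the bird hit the ceiling" would drop.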
Design Principles (from the experiment)
- Give agents structured state access — not screenshots, not logs, structured data they can parse programmatically
- Let agents control execution flow — clock control, pause/resume, step-through
- Minimize the observation-to-action cycle — every tool call should return immediately actionable information
- Embed the endpoint in the target system — not a sidecar, not a proxy, the actual binary
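One way to read these principles together is a tool handler that lives inside the target process and answers every call with the current structured state. The sketch below is an assumption-heavy illustration: the `Game` class, the tool names `get_state` and `step`, and the JSON request shape are hypothetical and are not the tools from the experiment.

```python
import json


class Game:
    """Hypothetical target system that hosts its own tool endpoint."""

    def __init__(self) -> None:
        self.tick = 0
        self.score = 0

    def snapshot(self) -> dict:
        # Structured state, not a screenshot or a log line.
        return {"tick": self.tick, "score": self.score}

    def get_state(self, args: dict) -> dict:
        return self.snapshot()

    def step(self, args: dict) -> dict:
        # Clock control: the world advances only by the requested ticks.
        ticks = int(args.get("ticks", 1))
        self.tick += ticks
        self.score += ticks   # placeholder game logic
        return self.snapshot()

    def handle(self, request: str) -> str:
        """Dispatch one JSON tool call and reply with the resulting state."""
        call = json.loads(request)
        name = call.get("tool")
        tools = {"get_state": self.get_state, "step": self.step}
        handler = tools.get(name)
        if handler is None:
            return json.dumps({"ok": False, "error": f"unknown tool: {name}"})
        return json.dumps({"ok": True, "state": handler(call.get("args", {}))})


game = Game()
reply = game.handle('{"tool": "step", "args": {"ticks": 5}}')
print(reply)
```

Because `handle` is a method of the game object itself, there is no sidecar or proxy to drift out of sync with the real state, and the reply to an action already carries what the agent needs for its next decision.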
Related Concepts
- tool-mediated-game-state-control — the specific tool implementation of this principle
- time-as-knob — temporal control as the most novel mechanism in the loop
- model-size-agnostic-iteration — the implication that loop quality > model quality
- operator-control-patterns — factory's human-in-the-loop version of the same principle
Open Questions
- Does this generalize beyond games, where state is deterministic and machine-readable, to web apps, APIs, and business processes?
- What's the minimum viable toolset? (Nikita used 3 tools; would 2 suffice?)
- At what model quality does the loop advantage disappear?