Human-in-the-loop

When to pause, when to proceed

Jun 20, 2026

There's a design choice buried inside every agent system that most teams make implicitly rather than deliberately: at what points, if any, does a human have to say yes before something irreversible happens? Teams that don't ask this question explicitly tend to discover the answer in production, usually after an agent has done something expensive, public, or impossible to undo. Teams that ask it upfront build something different — not less autonomous, but autonomy with a specific, designed boundary.

HITL vs. HotL: Two Different Things

The vocabulary matters here because the two most common oversight patterns are often conflated. Human-in-the-loop means the agent pauses. It cannot proceed without an explicit human response — approve, reject, or redirect. Human-on-the-loop means the agent executes, but a human can observe and intervene in real time. If they don't, execution continues.

HITL is a blocking gate. HotL is a monitoring posture. Both have legitimate uses. The mistake is applying HotL to situations that require HITL: assuming that because a human could intervene, they will, in time, and with enough context to make the right call. Agentic AI inverts the traditional review relationship: previously, AI suggested and humans decided; now, AI executes and humans intervene if they catch it. The window for intervention has shrunk from hours to seconds. For actions with a large blast radius, HotL is not a substitute for HITL.

The Classification Decision: What Needs a Gate

The most common mistake in HITL design is treating it as an all-or-nothing choice: either every action gets a gate (expensive, slow, and self-defeating for an agent), or nothing does (fast, but it leaves the system open to the expensive, public, or irreversible mistakes described above). The right answer is classification: every action in the agent's tool set is categorized by risk, and the category determines the oversight pattern.

Fully autonomous — no gate required.

Actions that are low-blast-radius, reversible, and within a well-tested scope. Reading data, searching knowledge bases, drafting content that gets reviewed before sending, calling APIs with no side effects.

These run without any approval mechanism. Gating them would add latency and human burden with no material risk reduction. The test: if this action produces a wrong result, how quickly and cheaply can it be detected and corrected? If the answer is 'quickly and cheaply,' it belongs here.

Audit-log gate — non-blocking, but recorded.

Actions that are consequential but reversible, or that carry compliance requirements without irreversibility. Sending an internal notification, updating a non-critical record, logging a support ticket state change.

The action executes immediately: no latency introduced, no approval bottleneck. But every execution is recorded in an immutable audit log with the full context: what the agent did, why it did it, what inputs it received, and what output it produced. This is non-blocking oversight: the human can review and roll back if needed, but the default is execution.

The implementation differs from a blocking gate only in its mode: the same decorator or policy applies, but set to type='audit' rather than type='required'. The gate records; it does not block.
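As a rough illustration of one decorator carrying both modes, here is a minimal sketch. The `gated` decorator, the `AUDIT_LOG` and `PENDING` lists, and the two example tools are all hypothetical names invented for this example; they are not any framework's actual API, and a real audit log would live in append-only storage rather than a Python list.

```python
import functools
from datetime import datetime, timezone

AUDIT_LOG = []  # hypothetical stand-in for an immutable audit store
PENDING = []    # hypothetical stand-in for an approval queue


def gated(type):
    """Hypothetical policy decorator; `type` mirrors the article's wording
    and shadows the builtin only inside this function."""
    if type not in ("audit", "required"):
        raise ValueError(f"unknown gate type: {type!r}")

    def decorator(tool):
        @functools.wraps(tool)
        def wrapper(*args, **kwargs):
            record = {
                "tool": tool.__name__,
                "args": args,
                "kwargs": kwargs,
                "at": datetime.now(timezone.utc).isoformat(),
            }
            if type == "required":
                # Blocking mode: queue the proposal and do NOT run the tool.
                PENDING.append(record)
                return {"status": "pending_approval", "tool": tool.__name__}
            # Audit mode: run immediately, then record what happened.
            result = tool(*args, **kwargs)
            AUDIT_LOG.append({**record, "result": result})
            return result

        return wrapper

    return decorator


@gated(type="audit")
def update_ticket_status(ticket_id, status):
    return f"{ticket_id} -> {status}"


@gated(type="required")
def send_external_email(to, body):
    return f"sent to {to}"
```

Calling `update_ticket_status("T-1", "closed")` runs the tool and appends one audit record; calling `send_external_email(...)` returns a pending status and leaves the tool unexecuted until something outside this sketch approves it.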

Synchronous approval — blocking gate before execution.

Actions that are irreversible, high-blast-radius, or that cross defined policy thresholds. Sending external communications, executing financial transactions, deleting records, modifying production configurations, and any other action where a wrong result cannot be cheaply corrected.

The agent proposes the action, surfaces its reasoning and context, and halts. The workflow is paused and state is persisted — the agent's work is preserved, nothing is lost. A human reviews via a purpose-built interface showing the proposed action, the agent's reasoning, and the relevant context. They approve, reject, or redirect. The agent resumes or aborts based on the response.

This is the right gate for high-stakes actions. It is not the right default for everything — as a blanket policy, synchronous approval creates an approval bottleneck that defeats the point of autonomous execution. Applied to the specific action types that warrant it, it is the difference between an agent that can be trusted with real authority and one that cannot.
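The three static tiers described so far can be sketched as a small classification function. This is only an illustration under stated assumptions: the `ActionProfile` fields and the order of the rules are choices made for this example, and a real classifier would also look at arguments and context.

```python
from dataclasses import dataclass
from enum import Enum


class Tier(Enum):
    AUTONOMOUS = "autonomous"  # no gate
    AUDIT = "audit"            # non-blocking, recorded
    APPROVAL = "approval"      # blocking gate before execution


@dataclass(frozen=True)
class ActionProfile:
    """Hypothetical risk profile for one tool in the agent's tool set."""
    name: str
    reversible: bool
    has_side_effects: bool
    high_blast_radius: bool = False
    compliance_relevant: bool = False


def classify(profile: ActionProfile) -> Tier:
    # Irreversible or high-blast-radius actions always get a blocking gate.
    if not profile.reversible or profile.high_blast_radius:
        return Tier.APPROVAL
    # Consequential but reversible actions execute and are recorded.
    if profile.has_side_effects or profile.compliance_relevant:
        return Tier.AUDIT
    # Everything else is cheap to detect and correct.
    return Tier.AUTONOMOUS
```

Under these rules a knowledge-base search lands in the autonomous tier, a non-critical record update in the audit tier, and a record deletion in the approval tier.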

Confidence-based routing — automated triage.

An additional layer that can be applied across any of the tiers above: if the agent's confidence in its decision falls below a calibrated threshold, it escalates regardless of the action category. A routine action that the agent is uncertain about gets routed for human review even if that action type normally runs autonomously.

This requires a confidence signal: the model's own expressed uncertainty, the output of an evaluation model, or a structured check against policy. The threshold is calibrated, not arbitrary: it should be set from the error rate the system can tolerate and from the cost of false positives (unnecessary escalations) relative to false negatives (missed escalations).
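One way to make that calibration concrete is to pick the threshold that minimizes average cost on a labelled sample of past decisions. The sketch below assumes such a sample exists as `(confidence, was_correct)` pairs and that the two costs can be expressed in the same unit; the function names and the cost figures in the usage note are illustrative, not taken from any source.

```python
def expected_cost(samples, threshold, cost_missed, cost_unnecessary):
    """Average cost per decision if we escalate whenever confidence < threshold.

    samples: list of (confidence, was_correct) pairs from labelled history.
    """
    total = 0.0
    for confidence, was_correct in samples:
        escalated = confidence < threshold
        if escalated and was_correct:
            total += cost_unnecessary  # false positive: needless escalation
        elif not escalated and not was_correct:
            total += cost_missed       # false negative: wrong action went through
    return total / len(samples)


def calibrate_threshold(samples, cost_missed, cost_unnecessary):
    """Return the candidate threshold with the lowest expected cost.

    Candidates are each observed confidence, plus 0.0 (never escalate)
    and a value above 1.0 (always escalate). Ties go to the lowest threshold.
    """
    candidates = sorted({c for c, _ in samples} | {0.0, 1.01})
    return min(
        candidates,
        key=lambda t: expected_cost(samples, t, cost_missed, cost_unnecessary),
    )
```

For example, with six labelled decisions where the wrong ones sit at confidence 0.5 and 0.7, and a missed escalation assumed to cost ten times an unnecessary one, `calibrate_threshold` returns 0.8: both errors are caught at the price of one needless review.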

Confidence-based routing handles the edge cases that static tier classification misses: the routine action that's being attempted with unusual inputs, the normally-safe tool call that's being invoked with an argument that looks wrong.
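The override itself is small. The following sketch assumes tiers are plain strings and that each confidence signal is a score between 0 and 1; taking the weakest signal and escalating when no signal is available are design assumptions of this example, not requirements stated above.

```python
def route(tier, signals, threshold):
    """Return (effective_tier, reason) for one proposed action.

    tier: 'autonomous', 'audit', or 'approval' from static classification.
    signals: dict of signal name -> score in [0, 1], e.g. model uncertainty,
             an evaluator model's score, or a structured policy check.
    """
    if tier == "approval":
        return "approval", "static tier already requires approval"
    if not signals:
        # Assumption: fail closed when there is nothing to judge confidence by.
        return "approval", "no confidence signal available"
    weakest = min(signals, key=signals.get)
    if signals[weakest] < threshold:
        return "approval", f"{weakest} below threshold"
    return tier, "confidence above threshold"
```

A normally autonomous tool call with a confident model but a failing policy check, such as `route("autonomous", {"model": 0.95, "policy_check": 0.4}, 0.8)`, is escalated because the weakest signal decides.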

The Context Problem: Why Approval Gates Fail Even When They Trigger

The most common failure mode of HITL in practice is not technical. It's informational. An approval gate triggers correctly — the agent pauses, the human is notified, the workflow waits. The human looks at the notification and sees: 'Agent wants to send email to customer.' They approve, because they don't have enough information to do otherwise. The email contains a hallucinated billing detail that causes a complaint.

Without a well-designed context package at the handoff — the proposed action, the agent's reasoning for choosing it, the specific content being sent, what alternatives the agent considered, and what happens if the human declines — human review becomes a rubber stamp.
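A context package like the one just listed can be modelled as a typed record that is checked before the reviewer is ever notified. This is a minimal sketch: the `ContextPackage` class, its field names, and the rule that an empty field counts as missing are assumptions made for illustration.

```python
from dataclasses import dataclass, fields


@dataclass(frozen=True)
class ContextPackage:
    """Hypothetical handoff record shown to the human reviewer."""
    proposed_action: str
    reasoning: str
    payload: str                    # the specific content being sent
    alternatives_considered: tuple  # other options the agent weighed
    consequence_if_declined: str


def missing_fields(package):
    """Names of fields left empty; a non-empty result means the request
    is not ready for review and should not reach the approver."""
    return [f.name for f in fields(package) if not getattr(package, f.name)]
```

The thin notification from the email example, which carries only 'Agent wants to send email to customer', would fail this check on four of five fields, which is the signal that the reviewer is being asked to rubber-stamp.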

What the Gate Technically Requires

A working approval gate has five distinct technical components — and is fragile if any of them is missing or poorly implemented.

A risk classifier on the action surface — determining, at runtime, whether a proposed tool call requires approval based on action type, arguments, context, and confidence. Without this, the gate either triggers on everything (unusable) or nothing (pointless).

State persistence during the pause — the workflow checkpoint must be written to durable storage before the approval notification is sent, so the agent's progress survives the potentially hours-long wait for human response. LangGraph's interrupt() handles this via its checkpointer. Without it, a timeout or system restart during the approval window loses the agent's work.

A human notification mechanism — Slack, email, an in-app queue — with routing to someone who is both available and authorized to make this decision.

A review interface with the full context package — not just 'approve/reject' but the proposed action, the agent's reasoning, the specific payload, and the downstream consequences of each choice.

A response path back into the workflow — the human's decision translates into the agent resuming (with approval) or aborting (with rejection), preserving session state in either case.
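Two of these components, state persistence and the response path, depend on ordering and are easy to get subtly wrong. The sketch below uses only Python's standard `sqlite3` module as a stand-in for durable storage; the table layout, the function names, and the `notify` and `execute` callbacks are hypothetical, and this is not how LangGraph's checkpointer is implemented.

```python
import json
import sqlite3


def open_store(path):
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS runs ("
        "run_id TEXT PRIMARY KEY, state TEXT NOT NULL, status TEXT NOT NULL)"
    )
    conn.commit()
    return conn


def pause_for_approval(conn, run_id, state, notify):
    """Checkpoint first, notify second, so a crash never loses the work."""
    conn.execute(
        "INSERT OR REPLACE INTO runs VALUES (?, ?, 'pending')",
        (run_id, json.dumps(state)),
    )
    conn.commit()   # state is durable before any human is asked
    notify(run_id)  # e.g. post to a chat channel or review queue


def resume_with_decision(conn, run_id, decision, execute):
    """Translate the human's decision back into the workflow."""
    if decision not in ("approve", "reject"):
        raise ValueError(f"unknown decision: {decision!r}")
    row = conn.execute(
        "SELECT state, status FROM runs WHERE run_id = ?", (run_id,)
    ).fetchone()
    if row is None or row[1] != "pending":
        raise LookupError(f"no pending run {run_id!r}")
    state = json.loads(row[0])
    result = None
    if decision == "approve":
        result = execute(state["proposed_action"])
        status = "approved"
    else:
        status = "rejected"
    # The stored state is kept in both cases; only the status changes.
    conn.execute("UPDATE runs SET status = ? WHERE run_id = ?", (status, run_id))
    conn.commit()
    return status, state, result
```

Because the checkpoint is committed before `notify` runs, a process restart during a long approval wait can reopen the store and still find the pending run. Refusing to resume a run that is no longer pending also keeps a duplicated approval click from executing the action twice.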

KEY TAKEAWAYS

HITL is a blocking gate — the agent cannot proceed without a human response. HotL is a monitoring posture. They serve different purposes and aren't interchangeable.
The classification decision — which actions get which kind of gate — is the core design choice. It should be made upfront, not discovered in production.
Four patterns: fully autonomous, audit-log (non-blocking record), and synchronous approval (blocking gate) as static tiers, plus confidence-based routing (dynamic escalation on uncertainty) layered across all three.
A pause without infrastructure is just a wait. Five components make a gate work: classifier, state persistence, notification, context-rich review interface, response path.
The most common HITL failure is informational, not technical — the gate triggers but the human lacks enough context to do anything other than rubber-stamp it.
Calibrated automation is the goal: the right actions proceed autonomously, and the right actions get the human judgment they actually require.

REFERENCES

Human-in-the-Loop: A 2026 Guide to AI Oversight

Strata, May 2026

Agentic AI inverts the traditional review relationship. Previously: AI suggests, human decides. Now: AI executes, human intervenes if they catch it. Without identity governance defining what an agent can do autonomously, HITL checkpoints have no enforcement mechanism. EU AI Act: August 2, 2026 compliance deadline for high-risk AI systems.

Human-in-the-Loop vs Human-on-the-Loop for AI Agents

Waxell, Apr 2026

LangGraph interrupt() pauses graph execution, persists state to checkpointer, resumes only when human response received. MIT Technology Review (Apr 2026): human overseers cannot verify what the AI reasons about internally — the context problem applies to enterprise agents too.

Human-in-the-Loop Escalation Design for AI Agents 2026

Digital Applied, Jun 2026

Escalation is the under-built layer. Evals and observability detect problems; escalation design is the enforcement layer that prevents the irreversible ones. A four-tier action-risk classification, calibration math, and the technical reasons synchronous approval breaks in real infrastructure.

Human-in-the-Loop Patterns for AI Agents (2026)

MyEngineeringPath, Mar 2026

Three approval patterns: pre-action (blocking gate before execution), post-action (review after, rollback possible), confidence-based (automated routing by score threshold). Without full context at the approval gate, approvers make uninformed decisions as quickly as possible — defeating the purpose.

How to add human-in-the-loop controls to AI agents that actually run in production

Agno, Apr 2026

type='required' creates a blocking gate — tool does not execute until administrator approves or rejects. type='audit' is non-blocking — tool executes immediately while a record is created for compliance. Two distinct patterns, both implementable on the same decorator API.

How to Build Human-in-the-Loop Oversight for AI Agents

Galileo, Apr 2026

Synchronous HITL: orchestrator pauses and serializes state, returns a status with invocation identifier, human reviews via UI with full context, session resumes with approval status. Best as a policy-enforcement mechanism for specific high-risk actions, not default operational mode.

Human-in-the-Loop for AI Agents: How Approval Gates Work

BestAIWeb, May 2026

Five components compose a working gate: risk classifier, state persistence, human notification, review interface with context, response path back into agent. Strip any one and the gate either fails to trigger, fails to wait, fails to read the human's intent, fails to give enough context, or fails to translate the decision back into action.

Sirisha’s Substack
