Loop Engineering: Agent Loop Stopping Conditions: The Invisible Failure Mode
Every agent loop tutorial teaches you how to start a loop. Almost none teach you how to end one correctly. That gap is not a pedagogy problem; it is a production risk.
Loops that can start but cannot stop reliably are not production-ready. The stopping conditions are not optional safety features. They are load-bearing infrastructure, as critical as the tool invocation logic itself. And they are exactly what gets skipped when teams are in a hurry to ship.
The Token Cost Nobody Budgets For
The most visible failure mode for an uncontrolled AI agent loop is runaway token cost. A loop that does not converge will keep issuing tool calls and processing responses until something external interrupts it: a timeout, a budget cap, or the shock of an invoice.
The insidious part is that many non-convergent loops look productive in the short term. The AI agent is doing things. It is calling tools, generating responses, making progress toward something. The loop is running; the dashboard is green. The problem only becomes visible when the token count keeps climbing, and the work is never actually finished.
Token burn is the obvious failure. The subtler one is stuck agents.
Stuck AI Agents and Indefinite Retries
A stuck AI agent is not always a crashed agent. More often it is a loop that is technically running (issuing calls, receiving responses) but making no forward progress. The agent has hit a wall: a tool that returns the same error, a dependency that remains unsatisfied, a task decomposition that keeps producing the same subtask. Without a repetition guard, the loop keeps retrying because nothing in the loop architecture tells it to stop.
You kill the process manually in local testing. In production, the loop does the billing. The loop runs until the iteration budget or token budget is exhausted, which means you pay for every one of those identical, useless calls.
False Completeness: The Worst AI Agent Loop Failure Mode
The most damaging failure mode is not runaway cost or indefinite retry. It is false completeness.
False completeness occurs when the AI agent reports that the task is finished, and it is not. The agent says “done,” returns a success signal, and the orchestrator moves on. The downstream system receives a half-built artifact: code with placeholders, a repository that does not compile, a report with empty sections labeled “TODO.”
This failure is particularly dangerous because it is invisible at the boundary. The loop terminated cleanly. The exit code was zero. The agent said it succeeded. Nothing in the system flagged an error. The work was not done.
How does this happen? The answer is structural, and it points to the core design problem this article addresses.
The Maker/Checker Problem in Agentic AI
Same-model self-verification is structurally biased toward optimism. When the AI agent that produced the output is also the agent asked to verify that output, the two share the same world model, the same biases, and, critically, the same blind spots. If the agent did not understand the requirement well enough to complete it correctly the first time, it is unlikely to catch its own misunderstanding on review.
In software engineering terms, this is the same reason you do not let developers merge their own pull requests without review. The maker and the checker must be different things.
This failure mode, false completeness, is observed repeatedly in engineering practice and confirmed by source code analysis; it is a predictable consequence of same-model self-verification. Half-built code. Placeholders shipped as deliverables. Non-compiling repositories reported as complete.
Stopping Conditions Are Load-Bearing Infrastructure for AI Agents
An AI agent that cannot stop correctly has three observable symptoms: it burns tokens without converging, it retries without making progress, and it declares success before the work is finished. Each symptom has a corresponding stopping primitive that addresses it:
Runaway cost → iteration limit (hard cap)
Indefinite retry → repetition detection (argument-aware loop guard)
False completeness → external completion verification (the maker/checker doctrine)
The rest of this article builds each of those primitives from source-code-verified implementations, then shows how modern frameworks encode them at the platform level.
The Three Primitives Every Agentic AI Loop Needs
Three stopping conditions cover the failure modes you will actually hit in production. They are not exotic or experimental; they are source-code-verified implementations from a well-studied open-source AI agent loop. Build all three into every loop you ship. Do not rely on any one alone.
The reference implementation throughout this section is AlessandroAnnini/agent-loop. The primitives described below have been verified directly from the source code.
Primitive 1: The Iteration Limit (Hard Cap for Every AI Agent Loop)
The iteration limit is the non-negotiable safety floor. Every production AI agent loop needs one.
In agent-loop, the limit is configured via the --max-iterations CLI flag. The default is defined in constants.py:
DEFAULT_MAX_ITERATIONS = 20  # ①
① DEFAULT_MAX_ITERATIONS is the hard cap used by agent-loop when no --max-iterations flag is provided; set it explicitly for every production loop.
When the loop reaches this count, it terminates unconditionally. It does not matter whether the AI agent thinks the task is complete. It does not matter whether there are pending tool calls. The loop stops.
That unconditional quality is the point. The iteration limit is not a soft suggestion; it is a hard cap that terminates the loop regardless of state. It is the last line of defense against runaway loops when all other stopping conditions have failed to fire.
The default of 20 is a reasonable starting point, but the right value depends on the task. Complex multi-step tasks may legitimately need more turns. Simple retrieval tasks should need far fewer. What matters is that a limit exists and is enforced.
Every AI agent loop must have an iteration limit. If your framework does not provide one by default, set one explicitly.
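As a minimal sketch of what "unconditional" means in practice, the loop below stops at the cap no matter what the agent reports. The `run_with_cap` function and the `step` callable are hypothetical names introduced for illustration; this is not agent-loop's actual implementation, only the default value of 20 is taken from the text above.

```python
from typing import Callable, Tuple

DEFAULT_MAX_ITERATIONS = 20  # same default value as quoted above


def run_with_cap(
    step: Callable[[int], bool],
    max_iterations: int = DEFAULT_MAX_ITERATIONS,
) -> Tuple[int, str]:
    """Run one agent turn per iteration until done or the cap is reached.

    `step` is a hypothetical callable: it performs one turn and returns
    True when the agent believes the task is finished.
    """
    if max_iterations < 1:
        raise ValueError("max_iterations must be at least 1")
    for turn in range(1, max_iterations + 1):
        if step(turn):
            return turn, "completed"
    # The cap fires regardless of pending work or the agent's own opinion.
    return max_iterations, "iteration_limit"
```

Returning the stop reason alongside the turn count lets the caller distinguish a clean exit from a forced one, which is worth logging separately in production.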
The iteration limit catches runaway loops, but it cannot tell you whether the work is done; that is what Primitive 2 addresses. Neither of the two catches a stuck agent making zero forward progress before the budget is spent.
Primitive 2: Completion Detection (Semantic Termination for AI Agents)
The iteration limit terminates the loop at a hard boundary. Completion detection terminates it semantically: when the AI agent signals the task is complete.
In agent-loop, completion detection is implemented in detect_completion_signals() in loop_control.py. The function checks two independent signals:
Signal A: Completion phrases. The function scans the agent’s most recent response for phrases indicating the task is finished: “task complete,” “work is done,” and similar natural-language signals. If a completion phrase is present, the loop terminates cleanly.
Signal B: Brief response heuristic. If the response is under 100 characters AND no tool calls are present, the function treats this as an implicit completion signal. The reasoning is structural: a very short response with no pending tool calls is almost always a wrap-up message rather than a mid-task status update. AI agents in the middle of work produce long, tool-call-heavy responses.
Both signals are checked on every turn. Either one is sufficient to trigger a clean stop.
The two-signal approach is intentional. Completion phrases are explicit but depend on phrasing. The brief-response heuristic is implicit but catches cases where the agent ends its work without using a canonical phrase. Together they provide broader coverage than either signal alone.
One note on tuning: completion phrase lists should be extended for domain-specific loops. An agent that writes SQL queries may signal completion differently than one that manages file operations. The heuristic thresholds (100 characters, zero tool calls) may also need adjustment for loops where short affirmative replies are common mid-task.
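The two signals can be sketched as a single predicate. This is an illustration under stated assumptions, not the body of detect_completion_signals(): the function name `looks_complete`, the constant names, and the exact phrase list are hypothetical, and only the two example phrases and the 100-character threshold come from the description above.

```python
from typing import Sequence

# Hypothetical phrase list; extend it for domain-specific loops.
COMPLETION_PHRASES = ("task complete", "work is done")
BRIEF_RESPONSE_CHARS = 100  # threshold described in the text


def looks_complete(
    response_text: str,
    tool_calls: Sequence[object],
    phrases: Sequence[str] = COMPLETION_PHRASES,
) -> bool:
    """Return True if either completion signal fires for this turn."""
    text = response_text.strip().lower()
    # Signal A: an explicit completion phrase anywhere in the response.
    if any(phrase in text for phrase in phrases):
        return True
    # Signal B: a brief response with no tool calls attached.
    return len(text) < BRIEF_RESPONSE_CHARS and not tool_calls
```

Passing `phrases` as a parameter is one way to handle the tuning note: a SQL-writing agent and a file-management agent can each get their own list without changing the heuristic.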
Completion detection handles voluntary exits cleanly, but a stuck agent never signals completion, so the loop runs all the way to the iteration limit. That is the failure mode Primitive 3 is designed to catch first.
Primitive 3: Repetition Detection (Stopping Stuck AI Agent Loops)
Repetition detection is the guard against the stuck-agent failure mode. It identifies when the loop is making no forward progress by detecting repeated calls to the same tool with the same arguments.
In agent-loop, repetition detection is implemented in detect_repetitive_behavior() in loop_control.py. The mechanism is deterministic and cheap:
# Conceptual representation of the core mechanism
import hashlib
import json

def hash_tool_call(tool_name: str, arguments: dict) -> str:
    # sort_keys=True makes the serialized payload independent of argument order
    payload = f"{tool_name}:{json.dumps(arguments, sort_keys=True)}"  # ①
    return hashlib.sha256(payload.encode()).hexdigest()  # ②
① Serializing the tool name and arguments as a canonical string (with sort_keys=True) ensures the hash changes whenever either the tool or its inputs change, and stays the same when only the argument order differs between calls.
② SHA-256 hashing of the serialized payload produces a compact, deterministic fingerprint; matching hashes on five consecutive turns signals a stuck agent.
On every turn, the function computes a SHA-256 hash of each tool call: the tool name together with its arguments. It then checks whether the same hash has appeared five or more consecutive times. If it has, the loop is stuck: it is calling the same tool with the same inputs, expecting a different output from an operation that has already returned the same result multiple times.
The threshold is 5 consecutive identical calls. That number is intentionally not 1 or 2. A single repeated call might be a legitimate retry. Two or three might be reasonable backoff behavior. Five consecutive identical calls with identical arguments are a strong signal that the loop will not converge on its own.
The Critical Nuance: Argument-Aware Repetition Detection in AI Agents
Repetition detection is argument-aware. The same tool called with different arguments does not trigger it.
This distinction matters enormously. An AI agent traversing a filesystem will call the same tool (list_directory, read_file) many times, but with different paths each time. Argument-naive repetition detection would incorrectly halt that loop. Argument-aware detection correctly recognizes it as legitimate forward progress.
SHA-256 hashing of the argument payload provides this argument awareness efficiently and deterministically. The hash changes whenever the arguments change, even if the tool name stays the same.
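The consecutive-count check can be sketched as a small stateful guard. This is a hedged illustration rather than the code of detect_repetitive_behavior(): the `RepetitionGuard` class, the `fingerprint` helper, and the `observe` method are hypothetical names, and the streak-reset behavior is an assumption about how "consecutive" is counted.

```python
import hashlib
import json
from typing import Optional

REPETITION_THRESHOLD = 5  # consecutive identical calls, as described above


def fingerprint(tool_name: str, arguments: dict) -> str:
    """Hash the tool name plus canonically serialized arguments."""
    payload = f"{tool_name}:{json.dumps(arguments, sort_keys=True)}"
    return hashlib.sha256(payload.encode()).hexdigest()


class RepetitionGuard:
    """Tracks how many times in a row the same call fingerprint appears."""

    def __init__(self, threshold: int = REPETITION_THRESHOLD) -> None:
        self.threshold = threshold
        self._last: Optional[str] = None
        self._streak = 0

    def observe(self, tool_name: str, arguments: dict) -> bool:
        """Record one tool call; return True once the loop looks stuck."""
        digest = fingerprint(tool_name, arguments)
        if digest == self._last:
            self._streak += 1
        else:
            # Any different call (new tool or new arguments) resets the streak.
            self._last = digest
            self._streak = 1
        return self._streak >= self.threshold
```

With this shape, a filesystem traversal that calls `list_directory` fifty times with fifty different paths never trips the guard, while five identical `read_file` calls in a row do.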
These three primitives address whether the loop terminates, not whether the work it produced is correct. That gap is the maker/checker problem, and it requires a different approach entirely.
The Optional Fourth Layer: Human-in-the-Loop Gates for Agentic AI
A fourth mechanism in agent-loop is worth noting, even though it is not a stopping condition in the strict sense: the --safe flag.
When --safe is active, the loop requires explicit y/N confirmation from a human operator before any tool executes. This is not a per-loop gate; it is a per-tool gate. Every individual tool call must be approved.
This is human-in-the-loop mode (think code review for every tool call, not just the dangerous ones). It converts the autonomous AI agent loop into a supervised one. The loop does not stop, but no tool fires without human review. For high-stakes operations where the cost of an incorrect tool call is high, this is the appropriate posture.
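A per-tool gate of this kind can be sketched in a few lines. The function names `confirm_tool_call` and `execute_with_gate` are hypothetical, and so is the choice to return a "skipped" result when the operator declines; the text above does not say how agent-loop handles a declined call, so treat that part as an assumption.

```python
from typing import Any, Callable, Dict


def confirm_tool_call(
    tool_name: str,
    arguments: Dict[str, Any],
    ask: Callable[[str], str] = input,
) -> bool:
    """Ask the operator for y/N approval; anything but y/yes means no."""
    answer = ask(f"Run {tool_name} with {arguments!r}? [y/N] ")
    return answer.strip().lower() in ("y", "yes")


def execute_with_gate(
    tool: Callable[..., Any],
    tool_name: str,
    arguments: Dict[str, Any],
    safe: bool = True,
    ask: Callable[[str], str] = input,
) -> Dict[str, Any]:
    """Run `tool` only after approval when `safe` is on (assumed behavior)."""
    if safe and not confirm_tool_call(tool_name, arguments, ask):
        # Assumption: a declined call is reported back instead of raising.
        return {"status": "skipped", "reason": "operator declined"}
    return {"status": "ok", "result": tool(**arguments)}
```

Defaulting to "no" on an empty answer matches the y/N convention: the capitalized option is what a bare Enter keypress selects.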
Layering the Stopping Primitives: Defense in Depth for AI Agents
Ship all three primitives together. The iteration limit catches everything the other two miss. Completion detection exits cleanly when the work is done. Repetition detection catches the stuck-agent case before the iteration limit is reached.
The primitives map directly to failure modes:
Iteration limit addresses token burn and runaway AI agent loops with a hard, unconditional cap.
Completion detection addresses loops that keep running after the work is done, using semantic phrase matching and a response-length heuristic.
Repetition detection addresses stuck AI agents with SHA-256 hashing of tool arguments.
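To show how the layers fit together, here is a compact sketch of one loop that applies all three checks on every turn. Everything in it is hypothetical scaffolding: the `Turn` dataclass, the `run_loop` function, the `next_turn` callable, and the order in which the checks run are assumptions made for illustration, not agent-loop's control flow.

```python
import hashlib
import json
from dataclasses import dataclass, field
from typing import Callable, List, Optional, Sequence, Tuple


@dataclass
class Turn:
    """One agent response: its text plus (tool_name, arguments) pairs."""
    text: str
    tool_calls: List[Tuple[str, dict]] = field(default_factory=list)


def run_loop(
    next_turn: Callable[[int], Turn],
    max_iterations: int = 20,
    repeat_threshold: int = 5,
    phrases: Sequence[str] = ("task complete", "work is done"),
) -> Tuple[str, int]:
    """Return (stop_reason, turn_number) for the first condition that fires."""
    last_digest: Optional[str] = None
    streak = 0
    for turn_number in range(1, max_iterations + 1):
        turn = next_turn(turn_number)
        text = turn.text.strip().lower()

        # Layer 1: completion detection (phrase match or brief, tool-free reply).
        if any(p in text for p in phrases) or (len(text) < 100 and not turn.tool_calls):
            return "completed", turn_number

        # Layer 2: repetition detection over consecutive tool-call fingerprints.
        for name, args in turn.tool_calls:
            payload = f"{name}:{json.dumps(args, sort_keys=True)}"
            digest = hashlib.sha256(payload.encode()).hexdigest()
            if digest == last_digest:
                streak += 1
            else:
                last_digest, streak = digest, 1
            if streak >= repeat_threshold:
                return "stuck", turn_number

    # Layer 3: the hard cap catches whatever the other two missed.
    return "iteration_limit", max_iterations
```

A stuck agent exits on turn 5 instead of turn 20 here, which is the practical payoff of layering: the cheaper, more specific guards fire first, and the cap remains as the backstop.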
Defense in depth starts here. The next section addresses the problem these three primitives do not solve: verifying that the agent’s completion judgment is actually correct.
The Maker/Checker Doctrine: Never Let the AI Verify Its Own Done
The three primitives handle the mechanics of terminating the AI agent loop. They bound the iteration count, detect completion phrases, and identify stuck agents. What they do not do is verify whether the work is actually correct.
That gap is the maker/checker problem. And it requires a different approach entirely.
The Structural Flaw in Self-Verifying AI Agents
The entity that produces output must never be the same entity that certifies that output as complete.
This is the maker/checker doctrine. It is not a new principle; software engineering has relied on it for decades. Code review exists because the developer who wrote a function reviews it with the same mental model of what it should do, and therefore with the same blind spots about where it can fail. Unit tests exist because the compiler cannot catch logical errors; you need a separate verification artifact.
The same logic applies to AI agent loops, but the cost of failure is higher. The loop can emit its success signal, terminate cleanly, and deliver a broken artifact without anyone catching it at the boundary.
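One way to picture a checker that is not the maker is a deterministic gate that ignores the agent's own verdict and inspects the artifact directly. The sketch below is an assumption-laden illustration, not the article's prescribed mechanism: `check_python_artifact`, `accept_completion`, and the placeholder pattern are hypothetical, and the checks cover only two of the symptoms named earlier (non-compiling code and leftover placeholders).

```python
import re
from typing import List

# Hypothetical placeholder markers; tune for your own codebase.
PLACEHOLDER_PATTERN = re.compile(r"\b(TODO|FIXME)\b|NotImplementedError")


def check_python_artifact(source: str) -> List[str]:
    """Return a list of problems found without consulting the agent."""
    problems: List[str] = []
    try:
        # Built-in compile() parses the source without executing it.
        compile(source, "<artifact>", "exec")
    except SyntaxError as exc:
        problems.append(f"does not compile: {exc.msg} (line {exc.lineno})")
    for number, line in enumerate(source.splitlines(), start=1):
        if PLACEHOLDER_PATTERN.search(line):
            problems.append(f"placeholder on line {number}")
    return problems


def accept_completion(agent_says_done: bool, source: str) -> bool:
    """The agent's claim is necessary but never sufficient."""
    return agent_says_done and not check_python_artifact(source)
```

The design choice worth noting is that the agent's "done" signal only requests verification; the decision to accept the artifact belongs to a check that shares none of the agent's blind spots.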
What False Completeness Looks Like in Agentic AI





