Operational failure
Infinite Loop
When an agent repeats reasoning, tool calls, or retries without making meaningful progress.
What failed
An infinite loop occurs when an AI agent repeats reasoning steps, tool calls, retrieval attempts, or retries without making meaningful progress. Some loops are literally unbounded; others are eventually cut off by system limits, such as a maximum step count, but still waste time, money, and user trust.
Architecture context
Common in research agents, coding agents, browsing agents, IT automation, customer support agents, retrieval workflows, and any system that automatically retries failed tool calls.
Impact
Loops can cause runaway costs, latency spikes, tool rate-limit errors, poor user experiences, and delays in downstream workflows. A loop also indicates that the agent may lack proper stopping conditions, uncertainty handling, or recovery behavior.
Symptoms
- Repeated calls to the same tool with similar arguments.
- Repeated retrieval of similar documents.
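- The agent keeps revising its output without measurable improvement.
- The agent retries the same action after receiving the same error.
- Long traces with no state progress.
- The user waits while the agent continues unnecessary work.
- Repeated action patterns, such as cycling through the same sequence of steps.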
Detection signals
- High retry counts.
- No meaningful state transition.
- Similar tool inputs across steps.
- Cost or latency spikes.
- Agent reaches maximum step limit.
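
One of these signals, similar tool inputs across steps, can be checked directly against an agent trace. The sketch below is a minimal illustration, not a reference implementation: `ToolCall`, `fingerprint`, and `repeated_calls` are hypothetical names, and "similar" is approximated as an exact match after normalizing argument order and letter case. A real system might need fuzzier matching, for example on embeddings or edit distance.

```python
import json
from collections import Counter
from dataclasses import dataclass
from typing import Any


@dataclass(frozen=True)
class ToolCall:
    """Hypothetical record of one tool invocation in an agent trace."""
    tool: str
    args: dict[str, Any]


def fingerprint(call: ToolCall) -> str:
    """Canonical string for a call: tool name plus key-sorted, lowercased args."""
    normalized = json.dumps(call.args, sort_keys=True, default=str).lower()
    return f"{call.tool}:{normalized}"


def repeated_calls(trace: list[ToolCall], window: int = 6, threshold: int = 3) -> list[str]:
    """Return fingerprints seen at least `threshold` times in the last `window` calls."""
    recent = trace[-window:]
    counts = Counter(fingerprint(c) for c in recent)
    return [fp for fp, n in counts.items() if n >= threshold]


if __name__ == "__main__":
    trace = [
        ToolCall("search", {"query": "Reset VPN password"}),
        ToolCall("search", {"query": "reset vpn password"}),
        ToolCall("fetch", {"url": "https://example.com/kb/1"}),
        ToolCall("search", {"query": "RESET VPN PASSWORD"}),
    ]
    # The three search calls collapse to one fingerprint and cross the threshold.
    print(repeated_calls(trace))
```

The window and threshold values are arbitrary defaults chosen for illustration; they would need tuning against real traces so that legitimate repeated lookups are not flagged.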
Mitigations
- Add maximum step and retry limits.
- Detect repeated actions.
- Require strategy change after repeated failure.
- Add explicit stop conditions.
- Escalate when progress stalls.
- Budget tokens, tool calls, and elapsed time.
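
Several of these mitigations can be combined in a small guard object that the agent loop consults on every step. The following is a minimal sketch under stated assumptions: `LoopGuard` and `BudgetExceeded` are hypothetical names, the default limits are placeholders, and progress is assumed to be observable as a comparable snapshot of task state. Token and tool-call budgets are omitted for brevity but could be added as further counters in the same way.

```python
import time
from dataclasses import dataclass, field


class BudgetExceeded(Exception):
    """Raised when the run should stop, change strategy, or escalate."""


@dataclass
class LoopGuard:
    """Hypothetical guard enforcing step, retry, time, and stall limits."""
    max_steps: int = 20
    max_retries_per_error: int = 2
    max_seconds: float = 120.0
    max_stalled_steps: int = 3
    steps: int = 0
    stalled: int = 0
    started: float = field(default_factory=time.monotonic)
    error_counts: dict[str, int] = field(default_factory=dict)
    last_state: object = None

    def check_step(self, state: object) -> None:
        """Call once per agent step with a comparable snapshot of task state."""
        self.steps += 1
        if self.steps > self.max_steps:
            raise BudgetExceeded("step limit reached")
        if time.monotonic() - self.started > self.max_seconds:
            raise BudgetExceeded("time budget exhausted")
        # Count consecutive steps whose state equals the previous step's state.
        self.stalled = self.stalled + 1 if state == self.last_state else 0
        self.last_state = state
        if self.stalled >= self.max_stalled_steps:
            raise BudgetExceeded("no state progress; escalate")

    def record_error(self, error_key: str) -> None:
        """Call after a failed action; the same error may recur only a few times."""
        count = self.error_counts.get(error_key, 0) + 1
        self.error_counts[error_key] = count
        if count > self.max_retries_per_error:
            raise BudgetExceeded(f"repeated failure '{error_key}'; change strategy")


if __name__ == "__main__":
    guard = LoopGuard(max_steps=10, max_stalled_steps=2)
    try:
        for state in ["plan", "draft", "draft", "draft"]:
            guard.check_step(state)
    except BudgetExceeded as exc:
        print(f"stopped: {exc}")
```

Raising an exception keeps the stop condition outside the agent's own reasoning, so a confused agent cannot talk itself past the limit. The caller decides whether to escalate to a human, switch strategy, or return a partial result.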