An agent cost circuit breaker is an architectural control layer that monitors cost velocity, iteration count, consecutive failures, and scope violations in real time — then terminates execution when thresholds are exceeded, preventing the kind of runaway spend that has produced documented single-incident bills of $437, $47,000, and $2,847 from agents running unsupervised loops. This guide covers the four trigger dimensions, how to implement them at the provider/tool/session level, and why enforcement must live outside agent code at the governance plane.
What Makes Agent Cost Circuit Breakers Different From Traditional Circuit Breakers?
Traditional microservice circuit breakers guard against downstream service failure. They track error rates and open when a dependency is unhealthy. Agent cost circuit breakers track something fundamentally different: behavioral signals that indicate an agent is stuck in a pathological loop, even when every individual API call succeeds. A LangChain agent making 847 identical GPT-4 calls in under a minute isn’t hitting errors; it’s getting successful responses that confuse it into repeating itself. The $47,000 four-agent loop over 11 days involved four agents on the A2A protocol ping-ponging work back and forth, each call succeeding, each response looking legitimate in isolation. Only the aggregate pattern (cost velocity, repetitive tool signatures, zero progress) revealed the pathology.
Standard circuit breaker state machines (CLOSED/OPEN/HALF_OPEN) also don’t account for per-tool isolation. If your web scraping tool is looping but your CRM write tool is fine, a monolithic breaker kills everything. The agent-specific version needs per-tool, per-provider, and per-session breakers operating independently.
The Four Trigger Dimensions Every Agent Breaker Must Monitor
I’ve found that teams implementing circuit breakers for agents consistently miss one or more of these dimensions, leaving blind spots that produce real damage. An effective breaker monitors all four simultaneously.
1. Runaway Loop Detection (Same Tool + Same Args)
The simplest trigger: detect when an agent calls the same tool with the same arguments repeatedly. SHA-256 hash of (tool_name, serialized_args) with a sliding window of 3 identical calls is the baseline implementation. This catches the common failure mode where an agent receives an ambiguous response and retries the exact same call hoping for a different result.
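```python
import json
from collections import deque
from hashlib import sha256


class LoopDetector:
    def __init__(self, window_size=3):
        # Keep only the most recent `window_size` call signatures.
        self.window = deque(maxlen=window_size)

    def check(self, tool_name: str, args: dict) -> bool:
        """Return True when the last `window_size` calls were identical."""
        # Sorted keys make the hash independent of argument order.
        signature = f"{tool_name}:{json.dumps(args, sort_keys=True)}"
        self.window.append(sha256(signature.encode()).hexdigest())
        return len(self.window) == self.window.maxlen and len(set(self.window)) == 1
```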
2. Cost Velocity (Tokens/sec, $/min)
Cost velocity catches the fires that step-count caps miss. An agent consuming 8,000 tokens/min against a normal baseline of 500 tokens/min is clearly in trouble, but a static $100 run cap won’t trip until significant damage is done. Velocity tracking compares a rolling window (the last 60 seconds) against a trailing baseline (the last 24 hours or 100 runs). A reasonable threshold is three standard deviations above the baseline mean.
Claude Code task budgets provide session-level advisory limits, but they run inside the agent process. Cost velocity enforcement at the gateway layer catches what session-level caps miss by acting on burn rate rather than total burn.
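As a rough illustration of the rolling-window comparison described above, here is a minimal sketch. The class name `CostVelocityMonitor`, its parameters, and the idea of passing the baseline as a list of historical per-window spend figures are assumptions made for the example, not part of any particular gateway.

```python
from collections import deque
from statistics import mean, stdev


class CostVelocityMonitor:
    """Flags a session whose spend in the last `window_s` seconds is an outlier."""

    def __init__(self, baseline_per_window, window_s=60.0, sigmas=3.0):
        # baseline_per_window: historical spend per window, e.g. from past runs.
        if len(baseline_per_window) < 2:
            raise ValueError("need at least two baseline samples")
        self.threshold = mean(baseline_per_window) + sigmas * stdev(baseline_per_window)
        self.window_s = window_s
        self.events = deque()  # (timestamp, cost) pairs still inside the window

    def record(self, timestamp, cost):
        """Record one call's cost; return True if the breaker should trip."""
        self.events.append((timestamp, cost))
        # Drop events that have aged out of the rolling window.
        while self.events and self.events[0][0] <= timestamp - self.window_s:
            self.events.popleft()
        return sum(c for _, c in self.events) > self.threshold
```

Timestamps are passed in rather than read from the clock so the monitor can be replayed against logged traffic when tuning the threshold.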
3. Consecutive Failures (Same Operation, N Times)
This one is straightforward and maps most closely to traditional circuit breakers. If the same step fails 3 times consecutively (whether it’s an API timeout, a malformed response, or a validation error), trip the breaker for that tool. The key difference from standard retry logic: a circuit breaker for agents should also classify the error type before tripping. A non-retriable 4xx response (such as 400 or 403) should trip immediately, since the request won’t succeed no matter how many times you retry, while 5xx and 429 responses should count against the retry budget first.
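The classification rule can be sketched as follows. This is a simplified illustration that assumes every outcome arrives as an HTTP status code (timeouts and validation errors would need mapping onto one), and the class name and method are hypothetical.

```python
class ConsecutiveFailureBreaker:
    """Trips a tool's breaker after repeated failures of the same operation."""

    RETRIABLE = {429, 500, 502, 503}  # rate limits and transient server errors

    def __init__(self, max_consecutive=3):
        self.max_consecutive = max_consecutive
        self.consecutive = 0
        self.open = False

    def record(self, status_code):
        """Record one call outcome; return True if the breaker is now open."""
        if status_code < 400:
            self.consecutive = 0  # any success resets the streak
        elif status_code in self.RETRIABLE:
            self.consecutive += 1
            if self.consecutive >= self.max_consecutive:
                self.open = True
        else:
            # Non-retriable client error: retrying cannot help, so trip at once.
            self.open = True
        return self.open
```

Once open, the breaker stays open in this sketch; reset policy is a separate decision.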
4. Scope Violations
When an agent attempts an action outside its defined permission boundaries — writing to a database it shouldn’t, calling an API it hasn’t been authorized for — the breaker should trip on the first violation. Unlike the other three dimensions, this is an instant open, not a threshold-based decision. The scope allowlist must be enforced at the governance plane, not in the agent’s system prompt (which agents can be instructed to ignore).
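A first-violation trip might look like the sketch below. `ScopeGuard`, `ScopeViolation`, and the `(tool, resource)` pair format are illustrative assumptions; in practice this check would sit in the gateway or governance layer rather than in the agent process.

```python
from dataclasses import dataclass, field


class ScopeViolation(Exception):
    """Raised when an agent attempts an action outside its allowlist."""


@dataclass
class ScopeGuard:
    # Allowed (tool, resource) pairs, e.g. {("crm.read", "contacts")}.
    allowlist: frozenset
    tripped: bool = False
    violations: list = field(default_factory=list)

    def authorize(self, tool, resource):
        """Allow the action, or open the breaker on the first violation."""
        if self.tripped:
            raise ScopeViolation("breaker open: session awaiting human review")
        if (tool, resource) not in self.allowlist:
            self.tripped = True  # instant open, no threshold
            self.violations.append((tool, resource))
            raise ScopeViolation(f"{tool} on {resource} is outside the allowed scope")
```

After the first violation, even previously allowed actions are rejected until someone resets the guard.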
Three-Layer Circuit Breaker Hierarchy: Provider, Tool, and Session
Based on the AppScale architecture and my own production experience, the most robust approach is three independent breaker scopes:
| Layer | Scope | Threshold Example | Cooldown | Half-Open Strategy |
|---|---|---|---|---|
| Provider | All calls to a model endpoint | 50% error rate over 20 requests or 60s | 30s + exponential backoff | 10% traffic with simplified prompts |
| Tool | Individual tool (web scraper, CRM write, DB query) | 30% error rate over 10 requests | 30s | Disable tool only, let agent use others |
| Session | Per-user/per-run cumulative state | $50 total spend or 3x velocity spike | Manual reset only | N/A — human review queue |
The provider-level breaker wraps every LLM call. When it opens, fall back to a cheaper model or return a degraded response. The tool-level breaker isolates blast radius: if your web scraper is looping, your CRM integration keeps working. The session-level breaker is the FinOps safety net: it tracks cumulative cost and trips when the total exceeds a configurable ceiling, regardless of per-call health.
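To make the session layer concrete, here is a minimal in-memory sketch of a cumulative spend ceiling with manual reset. The names (`SessionBudgetBreaker`, `charge`, `manual_reset`) are hypothetical, and the sketch records cost after each call, so it can overshoot the ceiling by at most one call.

```python
class SessionBudgetExceeded(Exception):
    """Raised when a session crosses its cumulative spend ceiling."""


class SessionBudgetBreaker:
    def __init__(self, ceiling_usd=50.0):
        self.ceiling_usd = ceiling_usd
        self.spent_usd = 0.0
        self.open = False

    def charge(self, cost_usd):
        """Add one call's cost; raise once the ceiling is crossed or the breaker is open."""
        if self.open:
            raise SessionBudgetExceeded("session breaker open: manual reset required")
        self.spent_usd += cost_usd
        if self.spent_usd > self.ceiling_usd:
            self.open = True
            raise SessionBudgetExceeded(
                f"spent ${self.spent_usd:.2f} of ${self.ceiling_usd:.2f}"
            )

    def manual_reset(self, reviewer):
        """Close the breaker only after a named human has reviewed the session."""
        if not reviewer:
            raise ValueError("a reviewer is required to reset")
        self.open = False
        self.spent_usd = 0.0
```

There is deliberately no cooldown or automatic half-open path here, matching the manual-reset row in the table.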
Implementation-wise, use Redis-backed state so breaker status survives process restarts. I recommend PyBreaker as a starting point for Python projects, or a simple custom class if you need Redis persistence:
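```python
import time

import redis


class CircuitBreakerOpenError(Exception):
    """Raised when a call is rejected because the breaker is open."""


class RedisCircuitBreaker:
    def __init__(self, name, failure_threshold=10, cooldown=30):
        self.name = name
        self.failure_threshold = failure_threshold
        self.cooldown = cooldown  # seconds to stay OPEN before allowing a trial call
        # decode_responses=True makes get() return str instead of bytes.
        self.redis = redis.Redis(decode_responses=True)

    def _key(self, suffix):
        return f"breaker:{self.name}:{suffix}"

    def _open(self):
        self.redis.set(self._key("state"), "OPEN")
        self.redis.set(self._key("cooldown_until"), time.time() + self.cooldown)

    def call(self, func, *args, **kwargs):
        state = self.redis.get(self._key("state")) or "CLOSED"
        if state == "OPEN":
            cooldown_until = float(self.redis.get(self._key("cooldown_until")) or 0)
            if time.time() < cooldown_until:
                raise CircuitBreakerOpenError(f"Breaker {self.name} is OPEN")
            # Cooldown elapsed: let one trial call through.
            state = "HALF_OPEN"
            self.redis.set(self._key("state"), state)
        try:
            result = func(*args, **kwargs)
        except Exception:
            failures = self.redis.incr(self._key("failures"))
            # A failed trial call reopens immediately; otherwise wait for the threshold.
            if state == "HALF_OPEN" or failures >= self.failure_threshold:
                self._open()
            raise
        # Success: reset the failure count and close the breaker if it was probing.
        self.redis.set(self._key("failures"), 0)
        if state == "HALF_OPEN":
            self.redis.set(self._key("state"), "CLOSED")
        return result
```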
Retry Budgets: The Overlooked Pattern That Prevents Compound Cost Explosions
This is the single most underrated cost control pattern in the agent ecosystem. Standard per-call retry logic (3 retries with exponential backoff) is dangerously insufficient for multi-step agentic workflows. If each of 8 steps independently retries 3 times, the worst case is 24 retries on top of the 8 original calls, or 32 calls in total. And if each of those retries triggers tool sub-calls, the number compounds unpredictably.
A retry budget is a shared pool (e.g., 5 total retries) across the entire workflow run. Step 1 uses 2 retries; remaining steps share the other 3. Crucially, error classification is required: retry only 5xx and 429 (server errors and rate limits), never any other 4xx. A malformed request won’t fix itself on retry 7.
```python
class RetryBudget:
    # Only rate limits and transient server errors are worth retrying.
    RETRIABLE = {429, 500, 502, 503}

    def __init__(self, max_retries=5):
        self.max_retries = max_retries  # shared across the whole workflow run
        self.used = 0

    def can_retry(self, status_code: int) -> bool:
        if status_code not in self.RETRIABLE:
            return False
        return self.used < self.max_retries

    def consume(self):
        self.used += 1
```
Compose retry budgets with circuit breakers: if the retry budget is exhausted on the same step across multiple consecutive runs, trip the tool-level circuit breaker. The three-pattern combo — retry budget + idempotency keys + circuit breaker — is the resilience stack every production agent needs.
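One way to wire the two together is sketched below. It is a simplified illustration: `run_workflow`, `TransientError`, and `ExhaustionTracker` are hypothetical names, steps are plain callables, and backoff and idempotency keys are omitted for brevity.

```python
class BudgetExhausted(Exception):
    """Raised when the workflow-wide retry pool is empty."""


class TransientError(Exception):
    """Stand-in for a retriable failure such as a 429 or 503."""


def run_workflow(steps, max_retries=5):
    """Run steps in order, drawing every retry from one shared pool.

    Returns (results, retries_used).
    """
    retries_used = 0
    results = []
    for step in steps:
        while True:
            try:
                results.append(step())
                break
            except TransientError:
                if retries_used >= max_retries:
                    raise BudgetExhausted(f"retry budget of {max_retries} spent")
                retries_used += 1
    return results, retries_used


class ExhaustionTracker:
    """Signals a tool-breaker trip after the budget runs out on consecutive runs."""

    def __init__(self, max_consecutive_runs=2):
        self.max_consecutive_runs = max_consecutive_runs
        self.streak = 0

    def record_run(self, exhausted):
        self.streak = self.streak + 1 if exhausted else 0
        return self.streak >= self.max_consecutive_runs  # True means trip
```

The caller would catch `BudgetExhausted`, pass `True` to `record_run`, and open the tool-level breaker when it returns `True`.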
Semantic Loop Detection: Catching Reasoning Loops Step Counters Miss
Step counters catch infinite loops but completely miss reasoning loops where the agent makes apparent “progress” each step — just not useful progress. It rephrases the same query, calls the same API with slightly different parameters, generates near-duplicate outputs. Hash-based detection (dimension 1 above) catches exact duplicates. Semantic similarity catches the rest.
The implementation uses difflib’s SequenceMatcher with a 0.85 threshold over a sliding window of 3 outputs. It adds zero API cost because it runs entirely on the text the agent has already produced. For coding agents, also track test-only loops: if the agent runs tests 3 times without changing any source code, assume nothing further will change and stop.
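A minimal sketch of that check, using only the standard library, could look like this. The class name and the choice to compare each output with the one before it (rather than every pair in the window) are assumptions made for the example.

```python
from collections import deque
from difflib import SequenceMatcher


class SemanticLoopDetector:
    def __init__(self, window_size=3, threshold=0.85):
        self.window = deque(maxlen=window_size)
        self.threshold = threshold

    def check(self, output):
        """Return True when the window is full and each output nearly repeats the last."""
        self.window.append(output)
        if len(self.window) < self.window.maxlen:
            return False
        items = list(self.window)
        # Compare each output with its predecessor in the window.
        return all(
            SequenceMatcher(None, a, b).ratio() >= self.threshold
            for a, b in zip(items, items[1:])
        )
```

Character-level similarity is a cheap proxy; very long outputs may need truncation or normalization before comparison to keep the check fast.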
Agent governance frameworks should incorporate semantic loop detection as a mandatory control for any agent operating in production with write-side effects, since reasoning loops are the most common failure mode that existing observability tooling completely misses.
Governance Plane Enforcement: Architecture That Agents Can’t Bypass
This is the most important architectural decision in the entire pattern. Circuit breaker enforcement must live outside the agent’s code — at the AI gateway, governance plane, or infrastructure layer. Budget ceilings, velocity limits, and scope checks embedded in agent prompts or code can be bypassed by a compromised agent or a crashed script that keeps retrying.
The Waxell approach enforces 26 policy categories at the governance plane with no SDK and no agent rebuilds required. The architectural pattern is a middleware chain at the AI gateway.
Each middleware component runs outside the agent process. On violation, it returns a structured enforcement error (HTTP 402 Payment Required or equivalent), logs the full execution context (step count, cumulative cost, trigger metric), and writes a durable audit record that survives session termination. No agent code changes required because enforcement happens at the network layer.
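The shape of such an enforcement response can be sketched as below. The field names, the `circuit_breaker_open` error code, and the use of a line-delimited JSON log are assumptions for illustration; they do not describe any specific product's format.

```python
import json
from dataclasses import asdict, dataclass


@dataclass(frozen=True)
class EnforcementRecord:
    session_id: str
    trigger: str  # e.g. "cost_velocity", "loop", "scope"
    threshold: float
    observed: float
    step_count: int
    cumulative_cost_usd: float


def enforcement_response(record, audit_log):
    """Append the record to a durable log and build the error returned to the caller."""
    # audit_log is any writable text stream, e.g. an append-only file.
    audit_log.write(json.dumps(asdict(record)) + "\n")
    body = {"error": "circuit_breaker_open", **asdict(record)}
    return 402, body  # HTTP 402 Payment Required
```

Writing the audit line before building the response means the record exists even if the caller never reads the error.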
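For teams using the LiteLLM proxy as their AI gateway, the same check can be written as a custom pre-call hook. The sketch below assumes the `CustomLogger.async_pre_call_hook` interface described in LiteLLM’s proxy call-hook documentation and a `session_tracker` object of your own; verify the hook signature against the version you run:

```python
from litellm.integrations.custom_logger import CustomLogger


class CircuitBreakerError(Exception):
    """Raised to reject a request when a breaker threshold is exceeded."""


class CircuitBreakerMiddleware(CustomLogger):
    def __init__(self, session_tracker, max_dollars_per_min=10):
        super().__init__()
        # session_tracker: your own object exposing `async get_velocity(session_id)`.
        self.session_tracker = session_tracker
        self.max_dollars_per_min = max_dollars_per_min

    # The proxy awaits this hook before forwarding a request to the provider.
    async def async_pre_call_hook(self, user_api_key_dict, cache, data, call_type):
        session = (data.get("metadata") or {}).get("session_id")
        cost_velocity = await self.session_tracker.get_velocity(session)
        if cost_velocity > self.max_dollars_per_min:
            raise CircuitBreakerError(
                f"Cost velocity ${cost_velocity:.2f}/min exceeded limit"
            )
        return data


# Register an instance through the proxy config's `callbacks` setting.
handler_instance = CircuitBreakerMiddleware(session_tracker)
```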
What Happens After a Trip: Containment Before Diagnosis
When a circuit breaker trips, the response must prioritize containment over diagnosis. Here’s the protocol I’ve landed on after several incidents:
- Stop new executions for the affected scope immediately
- Let in-flight runs finish only if they’re in a safe state (reads allowed, writes blocked)
- Route new work to a human review queue or degraded fallback path
- Log the exact signal that triggered the trip — which dimension, what threshold, the current value
- Notify operators with the trip reason, not just a generic alert
- Require intentional reset — auto-reset is dangerous for write-side or customer-facing failures
Half-open state testing should use canary-style simplified prompts first, gradually increasing complexity. For an LLM provider breaker, send 10% of normal traffic with simplified prompts and require 3 consecutive successes before closing. For a tool-level breaker, disable the tool and let the agent operate without it. Even if the agent succeeds, the breaker stays open until human review confirms the root cause is resolved.
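The provider-level half-open rule (sample a fraction of traffic, close after consecutive successes) can be sketched like this. `HalfOpenGate` and its parameters are hypothetical, and prompt simplification is left to the caller.

```python
import random


class HalfOpenGate:
    """Admits a fraction of traffic while half-open; closes after N straight successes."""

    def __init__(self, sample_rate=0.10, successes_to_close=3, rng=None):
        self.sample_rate = sample_rate
        self.successes_to_close = successes_to_close
        self.rng = rng or random.Random()
        self.state = "HALF_OPEN"
        self.streak = 0

    def admit(self):
        """Decide whether this request may be sent as a probe."""
        if self.state == "CLOSED":
            return True
        if self.state == "OPEN":
            return False
        return self.rng.random() < self.sample_rate

    def record(self, success):
        """Report a probe result; any failure reopens the breaker."""
        if self.state != "HALF_OPEN":
            return self.state
        if success:
            self.streak += 1
            if self.streak >= self.successes_to_close:
                self.state = "CLOSED"
        else:
            self.state = "OPEN"
            self.streak = 0
        return self.state
```

Passing a seeded `random.Random` as `rng` makes the sampling reproducible in tests.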
Real-World Cost Data: What Each Pattern Would Have Prevented
| Incident | Cost | Primary Cause | Which Pattern Would Have Prevented It |
|---|---|---|---|
| Single-agent retry loop | $437 overnight | Identical tool calls repeating for 8 hours | Runaway loop detection (dimension 1) + retry budget |
| 4-agent A2A ping-pong | $47,000 over 11 days | Cross-agent work passing without progress | Cost velocity + session-level breaker |
| Image gen runaway | $700 overnight | Flaky API + k8s restart replay loop | Consecutive failure breaker + retry budget |
| GPT-4 847-call loop | $63 | Ambiguous tool response causing identical retries | Hash-based loop detection (dimension 1) |
| LangChain 10K iterations | Unknown (project destroyed) | 8,000 iterations in under 10 minutes | Cost velocity + wall-clock timeout |
The $47,000 case is particularly instructive. The organization’s dashboards showed each agent’s individual spend, and the $47K total was visible as a single line item only in retrospect. No alert fired because no single run exceeded a static budget cap. The velocity was moderate, spread over 11 days, but the spend was cumulative. If spend was roughly even across the incident, that is about $4,270 per day, so a session-level breaker with a $1,000 total ceiling would have stopped it on day 1.
FAQ
What’s the difference between a kill switch and a circuit breaker for AI agents?
A kill switch is a manual or time-based stop — it terminates an agent after a fixed duration or when a human hits a button. A circuit breaker is threshold-driven and automatic, responding to behavioral signals (loop detection, cost velocity, consecutive failures). Kill switches are a fallback; circuit breakers are primary prevention. You need both, but don’t confuse them.
How do I choose between provider-level, tool-level, and session-level breakers?
Start with session-level breakers (they’re the FinOps safety net), then add tool-level breakers for high-risk tools (write operations, payment APIs, email send), then provider-level breakers if you’re running at scale with multiple model endpoints. Each layer catches failure modes the others miss.
Can I implement these patterns without a dedicated AI gateway?
Yes. For small deployments, implement the middleware chain inside your agent framework’s request pipeline. AgentBudget is a lightweight open-source SDK that adds per-session dollar-denominated caps without a gateway. For production at scale, the governance plane approach (Waxell, LiteLLM proxy) is more robust because enforcement survives agent crashes.
Should I auto-reset circuit breakers for AI agents?
For read-side operations, a half-open auto-reset with canary testing is fine. For write-side operations (DB writes, email, payments), require human reset. The cost of a false-positive breaker trip is far lower than the cost of a false-negative loop that writes duplicate data or charges customers twice.
How do I tune circuit breaker thresholds for my specific workload?
Start with the default thresholds in this guide, enable verbose logging, and review trip events weekly. After 100 runs, adjust based on your observed baseline: set cost velocity thresholds at 3× your mean burn rate, loop detection at a sliding window of 3, consecutive failures at 3. Review again at 1,000 runs. Threshold tuning is an ongoing operational process, not a one-time configuration.
