AI Agent Risk & Blast Radius: A Production Reference
Cost is one dimension of runtime authority. The other — and the one that tends to produce the most damaging incidents — is action authority: bounding what an agent is permitted to do, not just how much it can spend doing it. A single agent action — a deploy, a refund, a deletion, an email blast to the wrong list — can cost more in damage than the agent's entire month of LLM bills. This guide covers risk scoring, blast-radius containment, and the patterns that keep a single mistake from cascading into an incident.
This is the action / damage dimension of runtime authority. For the cost / spend dimension, see LLM Cost Runtime Control Reference. For the full product framing across all dimensions, see Why Cycles.
Quantify it for your agent. Open the blast-radius risk calculator →: name your agent, define its action classes by reversibility and visibility, and see the monthly blast radius. Share the configured view with a teammate.
If you are debugging a live action incident, jump straight to the Incident Patterns catalog.
Why action authority is structurally different from cost control
Cost is a continuous variable: every call shaves a fraction of a budget. Action damage is discrete and sometimes irreversible. The 100,001st LLM call adds roughly 0.001% to the month's bill; the first deletion of the wrong table costs everything at once. Tools built around cost curves (alerting, monitoring, rate limits) are weak against this second class because there is no curve to watch: no gradual signal precedes the first irreversible action.
- What Is Runtime Authority for AI Agents? — the foundational concept
- Beyond Budget: How Cycles Controls Agent Actions — why budget alone is insufficient
- Runtime Authority vs Guardrails vs Observability — the three layers, what each does, what each cannot do
Action authority: the core concept
Action authority is the runtime decision: "given who this agent is, what it has already done, and what it is asking to do now — should this action be allowed?" That decision happens before the side effect, not after.
- Action authority: controlling what agents do — the protocol-level model
- Runtime authority vs authorization — the distinction from per-request auth
- AI Agent Action Control: Hard Limits on Side Effects
- AI Agent Runtime Permissions: Control Actions Before Execution
- AI Agent Action Authority: Blocking a Customer Email Before Execution — concrete walkthrough
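A minimal sketch of that decision shape, assuming a per-agent risk budget; every name here (`ActionRequest`, `authorize`, the budget fields) is hypothetical rather than a published API. The point is structural: the side effect executes only after an explicit allow.

```python
from dataclasses import dataclass
from enum import Enum

class Decision(Enum):
    ALLOW = "allow"
    DENY = "deny"

@dataclass
class ActionRequest:
    agent_id: str     # who this agent is
    tool: str         # what it is asking to do now, e.g. "issue_refund"
    risk_points: int  # pre-assigned risk for this action class

def authorize(req: ActionRequest, spent: int, budget: int) -> Decision:
    """Decide before the side effect: identity + history + requested action."""
    if spent + req.risk_points > budget:  # "what it has already done"
        return Decision.DENY
    return Decision.ALLOW

def run_tool(req: ActionRequest, spent: int, budget: int):
    # The gate sits in front of the call; a deny means the side effect never fires.
    if authorize(req, spent, budget) is Decision.DENY:
        raise PermissionError(f"{req.tool} denied for {req.agent_id}")
    # ... execute the tool only on an explicit allow ...
```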
Risk scoring: not all actions are equal
Reading a file is not the same as sending a refund or executing arbitrary code. Risk needs to be quantified per action class so authority decisions can be made by risk, not by call count.
- AI Agent Risk Assessment: Score, Classify, Enforce Tool Risk
- Assigning RISK_POINTS to agent tools — the implementation pattern
- Understanding units in Cycles — how risk points sit alongside USD and tokens
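One way this shows up in code, as a sketch: a RISK_POINTS table keyed by action class, with reversibility and visibility driving the score. The tool names and point values below are illustrative, not a canonical scale.

```python
# Illustrative RISK_POINTS table: scores rise with irreversibility and
# external visibility. Values and tool names are examples, not a standard.
RISK_POINTS = {
    "read_file":     1,   # reversible, internal
    "send_email":   20,   # irreversible, customer-visible
    "issue_refund": 50,   # irreversible, moves money
    "execute_code": 80,   # arbitrary side effects
    "drop_table":  100,   # catastrophic
}

def action_risk(tool: str) -> int:
    # Fail closed: an unscored tool is treated as the highest risk class
    # until someone explicitly classifies it.
    return RISK_POINTS.get(tool, max(RISK_POINTS.values()))
```

The default matters: defaulting unknown tools to the lowest score quietly exempts every newly added tool from the policy.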
Blast radius: containing damage when something does fire
Even with risk scoring and authority gates, things will fail. Blast-radius design asks: when an agent does something wrong, how far can the damage propagate before it is contained? Per-tenant boundaries, per-run budgets, and per-action caps each bound a different direction of propagation.
- Multi-tenant SaaS guide — tenant isolation as blast-radius containment
- Multi-Tenant AI Cost Control: Budgets and Isolation
- Concurrent agent overspend — the canonical concurrency-blast incident
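The three boundaries compose into a single check. The sketch below assumes risk points as the unit; the field names are hypothetical.

```python
from dataclasses import dataclass

@dataclass
class BlastRadiusLimits:
    per_action_max: int    # bounds the worst single action
    per_run_budget: int    # bounds the worst single run
    per_tenant_daily: int  # bounds the worst day for one tenant

def within_limits(limits: BlastRadiusLimits, risk: int,
                  run_spent: int, tenant_spent: int) -> bool:
    """All three caps must hold; each one stops a different propagation direction."""
    return (risk <= limits.per_action_max
            and run_spent + risk <= limits.per_run_budget
            and tenant_spent + risk <= limits.per_tenant_daily)
```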
High-risk action classes
Some action classes carry asymmetric damage. Each deserves an explicit policy at the authority layer rather than silently inheriting the default that covers every other tool the agent has.
- Cursor AI Agent Reportedly Deleted a Production Database in 9 Seconds — the canonical disaster
- AI Agent Silent Failures: Why 200 OK Is the Most Dangerous Response — the failure mode that hides damage
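In practice this often looks like an explicit policy table for the asymmetric classes, with everything else falling through to the default. The classes and constraints below are illustrative, not a recommended configuration.

```python
# Illustrative per-class policies. A high-risk class gets its own explicit
# rule; it never silently inherits the default that covers low-risk tools.
HIGH_RISK_POLICIES = {
    "drop_table":   {"allow": False},                       # never from an agent
    "issue_refund": {"allow": True, "max_amount_usd": 50},  # hard per-action cap
    "send_email":   {"allow": True, "max_recipients": 1},   # no accidental blasts
}

DEFAULT_POLICY = {"allow": True}  # the rule low-risk tools fall through to

def policy_for(tool: str) -> dict:
    return HIGH_RISK_POLICIES.get(tool, DEFAULT_POLICY)
```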
Degradation paths: deny, downgrade, disable, defer
When the authority layer says no, what should the agent do next? Outright denial is sometimes correct, but in many cases a graceful degradation — a smaller model, a smaller scope, a deferred action, a disabled feature — preserves the user experience while bounding the risk.
- When Budget Runs Out: AI Agent Degradation Patterns
- Degradation paths: deny, downgrade, disable, defer
- Caps and the three-way decision model — protocol-level decision shapes
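A sketch of the four-way choice, with illustrative thresholds (the 2x headroom factor and the reversibility test are assumptions, not fixed rules); returning `None` means proceed as asked.

```python
from enum import Enum
from typing import Optional

class Degradation(Enum):
    DENY = "deny"            # refuse outright and surface the error
    DOWNGRADE = "downgrade"  # smaller model, narrower scope
    DISABLE = "disable"      # turn the feature off (e.g. after repeated denials)
    DEFER = "defer"          # queue for later or for human approval

def choose_path(risk: int, remaining: int, reversible: bool) -> Optional[Degradation]:
    """Return None to proceed as asked, otherwise a degradation path."""
    if 2 * risk <= remaining:
        return None                   # comfortable headroom: proceed
    if risk <= remaining:
        return Degradation.DOWNGRADE  # thin headroom: shrink the ask
    if reversible:
        return Degradation.DEFER      # over budget but reversible: try later
    return Degradation.DENY           # over budget and irreversible: hard stop
```

Disable typically operates one level up: after repeated denials for the same feature, the caller turns the feature off rather than re-deciding every call.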
Authority attenuation across delegation chains
Agent systems often delegate to sub-agents or tool-using sub-routines. A naive design propagates trust forward; a safe design attenuates authority — each layer gets less than the layer that called it, and the policy decision is made fresh at each boundary.
- Agent Delegation Chains Need Authority Attenuation, Not Trust Propagation
- Zero Trust for AI Agents: Why Every Tool Call Needs a Policy Decision
- Agents Are Cross-Cutting. Your Controls Aren't. — why per-service auth does not bound a multi-service agent
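A sketch of attenuation as a pure function: a child's authority is the intersection of what the parent holds and what the child is granted, so no hop can widen it. All names here are illustrative.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Authority:
    risk_budget: int
    allowed_tools: frozenset

def attenuate(parent: Authority, budget: int, tools: frozenset) -> Authority:
    """A child can only hold a subset of its parent's authority, never more."""
    return Authority(
        risk_budget=min(budget, parent.risk_budget),
        allowed_tools=tools & parent.allowed_tools,
    )

# Each delegation hop strictly narrows what the next layer can do:
root = Authority(100, frozenset({"read_file", "send_email", "issue_refund"}))
researcher = attenuate(root, 30, frozenset({"read_file", "send_email"}))
summarizer = attenuate(researcher, 10, frozenset({"read_file", "issue_refund"}))
# "issue_refund" was requested but never held upstream, so it is stripped:
assert "issue_refund" not in summarizer.allowed_tools
```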
Why framework guardrails are not enough
Agent frameworks (LangChain, LangGraph, CrewAI, AutoGen, OpenAI Agents SDK, etc.) provide orchestration primitives, content guardrails, middleware, and tool-calling patterns. What they usually do not provide is a cross-agent, cross-tenant, ledger-backed runtime authority layer for budget, risk, and action decisions. The policy decision still has to happen, and it has to happen outside the agent loop.
- OpenAI Agents SDK: Content Guardrails, No Action Control
- MCP Tool Poisoning Has an 84% Success Rate — Why Agent Frameworks Still Can't Prevent It
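The usual shape of the fix is a wrapper that consults an authority service before dispatching any framework tool, so the decision lives outside the loop the agent can influence. The endpoint URL and response shape below are hypothetical.

```python
import json
import urllib.request

AUTHORITY_URL = "https://authority.internal/decide"  # hypothetical endpoint

def gated(tool_name: str, fn):
    """Wrap any framework tool so allow/deny is decided outside the agent loop."""
    def wrapper(*args, **kwargs):
        req = urllib.request.Request(
            AUTHORITY_URL,
            data=json.dumps({"tool": tool_name}).encode(),
            headers={"Content-Type": "application/json"},
        )
        with urllib.request.urlopen(req) as resp:
            verdict = json.load(resp)  # assumed response shape: {"allow": bool}
        if verdict.get("allow") is not True:
            raise PermissionError(f"{tool_name} denied by authority layer")
        return fn(*args, **kwargs)
    return wrapper

# send_email = gated("send_email", send_email)  # same signature, now gated
```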
Identity, keys, and least privilege
Every agent action ultimately resolves to a credential at the call site. Authority bounds what an agent is allowed to attempt; least-privilege keys bound what the underlying API will let the agent do if it tries. They are complementary layers.
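A sketch of that second layer, with placeholder keys and scope names: even if the authority gate mis-decides, a least-privilege credential cannot perform an action outside its scopes.

```python
# Placeholder per-agent credentials, each minted with only the scopes that
# agent's tools actually need. Keys and scope names are illustrative.
SCOPED_KEYS = {
    "support-bot": {"key": "sk-support-XXXX", "scopes": {"tickets:read", "tickets:reply"}},
    "billing-bot": {"key": "sk-billing-XXXX", "scopes": {"invoices:read"}},
}

def key_for(agent_id: str, scope: str) -> str:
    entry = SCOPED_KEYS[agent_id]
    if scope not in entry["scopes"]:
        # The credential layer holds even when the authority layer is wrong.
        raise PermissionError(f"{agent_id} holds no credential for {scope}")
    return entry["key"]
```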
Audit trail and attribution
Authority decisions create an audit trail by side effect: every allow/deny, every degraded action, every reservation that was made and committed or rolled back. That audit is the substrate for compliance, post-incident review, and tenant-level reporting.
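One workable record shape, as a sketch (the field names and JSONL sink are assumptions): one append-only line per decision, including the decisions that denied or degraded an action.

```python
import json
import time
import uuid
from dataclasses import dataclass, asdict

@dataclass
class AuditRecord:
    record_id: str
    timestamp: float
    agent_id: str
    tenant_id: str
    tool: str
    decision: str     # "allow" | "deny" | "downgrade" | "defer"
    risk_points: int
    reason: str       # why the policy decided what it did

def emit(rec: AuditRecord) -> None:
    # Append-only JSON lines: denied and degraded actions are logged too,
    # because the actions that did NOT happen are what post-incident review needs.
    with open("authority_audit.jsonl", "a") as f:
        f.write(json.dumps(asdict(rec)) + "\n")

emit(AuditRecord(str(uuid.uuid4()), time.time(),
                 "support-bot", "tenant-42", "send_email",
                 "deny", 20, "per-run risk budget exhausted"))
```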
Compliance and governance frameworks
Risk and blast-radius design is not just an engineering concern. NIST AI RMF, the EU AI Act, ISO 42001, and OWASP guidance increasingly push teams toward demonstrable controls, traceability, and evidence — not just intent or policy documents.
- The AI Agent Governance Framework: Mapping NIST, EU AI Act, ISO 42001, and OWASP
- State of AI Agent Governance 2026
Rolling out enforcement without breaking production
Adding an authority gate to an existing agent system is itself a risky deployment: a miscalibrated policy denies legitimate actions in production. Shadow mode lets the gate run in observe-only mode against real traffic so policies can be calibrated before any action is denied.
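A sketch of the shadow-mode switch (the `decide` callback and flag name are illustrative): the decision runs on every real request, but in shadow mode a would-be deny is only logged.

```python
import logging

log = logging.getLogger("authority.shadow")

def gate(request: dict, decide, enforce: bool = False) -> str:
    """Run the policy on real traffic; raise on deny only when enforce=True."""
    decision = decide(request)  # the same policy code that will later enforce
    if decision == "deny":
        if enforce:
            raise PermissionError(f"denied: {request}")
        # Shadow mode: record what WOULD have been blocked, block nothing.
        log.warning("shadow deny (not enforced): %s", request)
    return decision
```

Calibration then becomes reading the shadow-deny log: once the gate stops flagging legitimate traffic, flipping `enforce=True` is a no-surprise change.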
Incidents this is built to prevent
Most agent damage clusters into a small number of patterns. Recognizing the pattern is half the work.
The complement guide
This guide focuses on what an agent is allowed to do. For what an agent is allowed to spend, see LLM Cost Runtime Control Reference. Cost and action are the two dimensions of runtime authority — most production incidents touch both.