AI Agent Kill Switches Should Be Scoped
The support agent starts sending duplicate emails. The deployment agent is about to run the same rollout plan twice. A research workflow is burning through a tenant's model budget because a vendor API keeps returning ambiguous errors.
The first instinct is understandable: shut the agents off.
That may be the right emergency move. It is also often the most disruptive one if the switch is global. A global kill switch stops the broken workflow, but it also stops healthy tenants, unrelated agents, read-only work, and low-risk fallback paths. It can turn one incident into a broader platform outage.
Production agent systems need emergency controls. They also need scope.
A useful kill switch should answer: stop what, for whom, at which boundary, for how long, and with what audit trail?
Global Stops Are Sometimes Necessary
There are cases where a global stop is justified.
| Situation | Why a broad stop may be reasonable |
|---|---|
| Provider compromise | The dependency itself is not trustworthy |
| Known destructive prompt path | Many agents share the same vulnerable instruction |
| Credential leak | The same key or token is used across workflows |
| Unknown blast radius | Operators cannot identify affected tenants quickly |
| Regulatory hold | The organization needs to stop an entire class of processing |
The problem is not that global stops exist. The problem is treating them as the only emergency control.
Most incidents have a narrower shape. One tenant is over limit. One workflow is looping. One toolset is producing side effects. One webhook destination is failing. One agent identity is compromised. One budget scope is exhausted.
If the control plane cannot express that narrower shape, operators are forced to choose between underreacting and overreacting.
The Scope Ladder
Agent incident response should have a ladder of controls, not one big button.
| Scope | Example control | Use when |
|---|---|---|
| Tenant | Suspend tenant or freeze tenant budgets | One customer or account is affected |
| Workspace / environment | Freeze production but leave staging alone | A deployment environment is unsafe |
| Workflow | Disable or cap one process | One business flow is looping |
| Run | Stop one execution | One active task is misbehaving |
| Agent | Revoke or suspend one agent identity | One actor is compromised |
| Toolset | DENY email.send or refund.issue | Side effects are unsafe |
| Webhook | Pause one destination | Delivery is failing or downstream is unavailable |
| Budget | Freeze or reduce one ledger | Exposure needs to stop immediately |
The more precise the scope, the more likely operators can contain the incident without creating a broader outage.
This is the operational side of scope derivation. If runtime requests carry tenant, workspace, app, workflow, agent, and toolset fields, the platform has boundaries it can use during an incident.
Freeze Is Different from Delete
Emergency controls should preserve evidence.
Deleting an agent, removing a budget ledger, or closing a tenant may be appropriate at the end of an incident. It is usually the wrong first move. Operators need to stop new damage while preserving enough state for investigation.
| Action | Operational meaning |
|---|---|
| Freeze budget | Stop new reservations while preserving ledger state |
| Suspend tenant | Block activity for one tenant without deleting records |
| Revoke key | Stop one credential from authenticating |
| Pause webhook | Stop delivery attempts to one destination |
| Force-release reservation | Clear stuck holds without charging spend |
| Close tenant | Terminal lifecycle operation after decision and review |
This distinction matters because incidents are evidence problems as much as availability problems. You need to know what happened, when it happened, which scope was affected, and whether the control worked.
The Cycles admin docs make this separation explicit. Freezing a budget is an operational control: new reservations stop, history remains, and the budget can be unfrozen after investigation. Tenant close is a terminal lifecycle action with cascade semantics. Those should not be confused.
Runtime Authority Makes the Switch Safer
A scoped kill switch works best when enforcement happens before side effects.
If the only control is a log processor or dashboard alert, the system can discover the incident but not stop the next action. By the time an operator sees the chart, the agent may have already sent the email, issued the refund, or deployed the service.
Runtime authority moves the control point earlier:
Agent proposes action
-> runtime authority checks scope, budget, risk, and status
-> ALLOW, ALLOW_WITH_CAPS, or DENY
-> action executes only if allowedThat gives emergency controls a clean enforcement path. Freezing tenant:acme/workflow:refund does not require every agent framework to learn a new incident API. The next reservation against that scope is rejected, for example with a frozen-budget reason, before the action executes. The agent can stop, degrade, or escalate according to its own error handling.
The action boundary is also where RISK_POINTS help. If email sends, refunds, deployments, and external writes consume different toolset-scoped risk budgets, the platform can stop high-blast-radius actions while allowing read-only work to continue.
Avoid the False Binary
Incident response often gets framed as:
Let the agent continue
or
Turn off all agentsThat binary is too coarse for production.
Better options include:
| Decision | Example |
|---|---|
| ALLOW | Read-only customer lookup continues |
| ALLOW_WITH_CAPS | Continue with a cheaper model or lower token cap |
| DENY one toolset | Stop external emails but allow internal ticket creation |
| Freeze one budget | Stop a workflow that exceeded its run allowance |
| Suspend one tenant | Contain a customer-specific issue |
| Global stop | Last resort when scope is unknown or shared infrastructure is unsafe |
The goal is not to avoid hard stops. The goal is to make hard stops precise enough that they match the incident.
Bulk Actions Need Safety Gates
Sometimes the right scope is not one row. A compromised integration may affect dozens of tenants. A downstream endpoint may require pausing hundreds of webhook subscriptions. A pricing migration may require resetting many budgets at once.
Bulk actions are useful in those moments, but they need guardrails:
- filter-based targeting, not arbitrary "everything"
- expected counts before execution
- idempotency keys for retries
- per-row success, failure, and skipped buckets
- audit entries that capture the filter and outcome
- hard ceilings to prevent accidental fleet-wide changes
That is why the Bulk Actions for Tenants, Webhooks, and Budgets flow is built around preview, propose, execute, reconcile, and audit. The operator needs speed, but speed without a blast-radius check is how an incident response becomes a second incident.
A Practical Runbook Shape
A scoped agent kill-switch runbook can be simple:
- Identify the failing scope. Use balances, events, audit logs, or alerts to find tenant, workflow, agent, or toolset.
- Pick the narrowest effective control. Freeze a budget, pause a webhook, revoke a key, or suspend a tenant.
- Set an idempotency key. Make retries safe and traceable to the incident.
- Record the reason. Preserve operator intent in the audit trail.
- Watch the next decisions. Confirm new actions return DENY or ALLOW_WITH_CAPS at the affected scope.
- Reconcile after containment. Release stuck reservations, repay debt, adjust budgets, rotate keys, or unfreeze.
- Retire only when certain. Delete or close resources after evidence and ownership are clear.
The runbook should be practiced before the incident. Shadow mode is useful here: it can show which scopes would deny actions under proposed policies before those denials affect users.
Where Cycles Fits
Cycles provides the runtime and admin primitives behind this pattern:
| Need | Cycles surface |
|---|---|
| Stop new spend or risk at one boundary | Budget freeze / budget reduction |
| Limit exposure before action execution | Reserve-commit lifecycle |
| Express narrow blast radius | Tenant, workspace, workflow, agent, and toolset scopes |
| Preserve operator evidence | Audit logs and events |
| Handle repeated operation safely | Idempotency keys |
| Act on many affected rows | Bulk actions |
| Roll out without breaking production | Shadow mode and dry-run decisions |
That does not replace the rest of incident response. You still need alerting, ownership, credentials, runbooks, and post-incident review. Runtime authority gives those processes an enforcement point that is precise enough to be operationally useful.
The Takeaway
AI agent kill switches should not be only global.
A global stop is a necessary last-resort control, but many incidents can use a narrower response: freeze one budget, suspend one tenant, pause one webhook, revoke one key, deny one toolset, or cap one run.
Scoped controls let teams contain damage without converting every agent incident into a platform outage.
Sources
- Microsoft Security Blog: Zero Trust for AI - zero trust framing for agent behavior and security lifecycle
- OWASP Top 10 for Agentic Applications 2026 - agentic risk framework for autonomous systems
- Using Bulk Actions for Tenants, Webhooks, and Budgets - filter-based fleet operations with safety gates
- Budget Allocation and Management in Cycles - freeze, fund, reset, and budget lifecycle operations
- Tenant-Close Cascade Semantics - terminal tenant lifecycle behavior and mutation guards
- Deploying the Cycles Dashboard - operational surface for freeze, suspend, revoke, pause, and force-release actions
- Shadow Mode in Cycles - rollout pattern for testing enforcement before blocking