How to Add Budget Limits to an MCP Server
An MCP server starts as an integration convenience.
You expose search_docs, lookup_customer, send_email, issue_refund, or run_report. The agent host discovers those tools and calls them through a standard protocol instead of a custom integration.
That is the easy part.
The production question comes next: which of those tool calls should still run after the agent has already searched 20 times, retried a failing API, or crossed the customer's budget?
MCP standardizes access to tools. It does not, by itself, decide whether the next tool call is still inside the allowed budget. To add hard budget limits, the MCP server has to put a budget decision in the execution path before the tool handler creates cost or side effects.
This post is the rollout checklist.
1. Decide what kind of MCP server you are operating
There are two common patterns, and they solve different problems.
| Pattern | What it does | Budget limit behavior |
|---|---|---|
| Cycles MCP server | Exposes Cycles tools such as cycles_reserve, cycles_commit, and cycles_release to an MCP-compatible agent | Cooperative: the agent can call Cycles tools, but other tools are not automatically gated |
| Your application MCP server | Exposes your business tools, data access, or side effects | Enforced when each business tool handler requires a budget decision before it runs |
The Cycles MCP server quickstart is the fastest way to let an MCP-compatible agent participate in budget workflows. That is useful for development, evaluation, and agent-assisted integration work.
For hard production enforcement, the budget check must sit on the action path. If send_email is the risky tool, then the send_email handler must require a reserve or decide call before it sends the email. If the agent can call a second email tool that bypasses that check, the budget does not actually constrain the action.
The Cycles MCP integration guide makes the same distinction: exposing budget tools through MCP is useful, but preventative control requires the costly or risky action to depend on the budget decision.
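One way to catch the bypass problem described above is to audit the tool registry for side-effecting tools that have no budget gate. The sketch below is illustrative only: the `ToolEntry` shape, the `findUngatedTools` helper, and the `send_email_v2` tool are hypothetical names, not part of MCP or Cycles.

```typescript
// Hypothetical registry entry. None of these names come from the MCP SDK or Cycles.
type ToolEntry = {
  name: string;
  sideEffect: boolean; // true if the tool can create cost or customer impact
  budgetGated: boolean; // true if the handler requires a reserve or decide call first
};

// Returns the names of tools that can create cost or side effects
// without passing through a budget decision.
function findUngatedTools(tools: ToolEntry[]): string[] {
  return tools.filter((t) => t.sideEffect && !t.budgetGated).map((t) => t.name);
}

const registry: ToolEntry[] = [
  { name: "search_docs", sideEffect: false, budgetGated: false },
  { name: "send_email", sideEffect: true, budgetGated: true },
  // A second email path that skips the check: this is the bypass to look for.
  { name: "send_email_v2", sideEffect: true, budgetGated: false },
];

const ungated = findUngatedTools(registry);
```

A check like this could run at server startup or in CI, so a newly added tool cannot quietly skip the budget path.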
2. Classify the tools before setting budgets
Not every MCP tool needs the same limit.
Start by grouping tools into practical classes:
| Tool class | Examples | First budget to add |
|---|---|---|
| Read-only lookup | search_docs, lookup_invoice, fetch_ticket | Token, request, or small spend budget |
| Paid external call | web search, enrichment API, model call | Spend or credits budget |
| Customer-visible side effect | send_email, post_comment, create_ticket | Credits, spend, or risk budget |
| Financial action | issue_refund, adjust_invoice, apply_credit | Small RISK_POINTS budget plus approval for higher-risk cases |
| Infrastructure action | deploy, shell, data migration | Deny by default, then explicit narrow allowance |
This classification does not need to be perfect on day one. It needs to separate harmless read paths from actions that can create cost, customer impact, or operational risk.
The action-control version of this model is covered in Action Authority: Controlling What Agents Do. The budget-control version starts with the same inventory: what can this tool do, and what could go wrong if it runs repeatedly?
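The classification table can be kept in code as a small lookup, so each handler resolves its policy the same way. This is a sketch under assumptions: the class keys, the `BudgetPolicy` fields, the unit labels other than RISK_POINTS, and the choice to treat unknown tools as the strictest class are all illustrative, not a Cycles schema.

```typescript
// Tool classes mirror the table above. The policy fields are illustrative assumptions.
type ToolClass =
  | "read_only"
  | "paid_external"
  | "customer_visible"
  | "financial"
  | "infrastructure";

type BudgetPolicy = {
  unit: "CREDITS" | "RISK_POINTS"; // hypothetical labels for the first budget to add
  denyByDefault: boolean;
  approvalAbove?: number; // amounts above this need human approval (assumed field)
};

const policyByClass: Record<ToolClass, BudgetPolicy> = {
  read_only: { unit: "CREDITS", denyByDefault: false },
  paid_external: { unit: "CREDITS", denyByDefault: false },
  customer_visible: { unit: "RISK_POINTS", denyByDefault: false },
  financial: { unit: "RISK_POINTS", denyByDefault: false, approvalAbove: 10 },
  infrastructure: { unit: "RISK_POINTS", denyByDefault: true },
};

const classByTool: Record<string, ToolClass> = {
  search_docs: "read_only",
  lookup_invoice: "read_only",
  send_email: "customer_visible",
  issue_refund: "financial",
  deploy: "infrastructure",
};

// Unknown tools fall back to the strictest class rather than the loosest.
function policyFor(toolName: string): BudgetPolicy {
  const toolClass = classByTool[toolName] ?? "infrastructure";
  return policyByClass[toolClass];
}
```

Falling back to the strictest class is a design choice in this sketch: a tool that nobody classified is denied until someone gives it an explicit allowance.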
3. Put reserve before the handler
For hard limits, the core flow is:
MCP tool call proposed
↓
reserve budget for this tenant, run, toolset, and estimate
↓
ALLOW / ALLOW_WITH_CAPS / DENY
↓
execute handler only if allowed
↓
commit actual usage on success
release reservation on failure or cancellation
The important property is ordering. The handler runs after the budget decision, not before it.
That gives the MCP server a clear failure mode:
- If the reserve call returns ALLOW, execute normally.
- If it returns ALLOW_WITH_CAPS, apply the cap before execution when the tool supports a smaller limit.
- If it returns DENY or a budget-exceeded error, do not call the handler.
- If the handler fails, release the reservation.
- If the handler succeeds, commit actual usage.
The protocol details are in How Reserve-Commit Works. For a concrete TypeScript wrapper, use Add Hard Budgets to MCP Tools Before They Execute as the implementation companion to this checklist.
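The ordering described above can be sketched with an in-memory stand-in for the budget service. `InMemoryBudget` and `runWithBudget` are hypothetical names and do not reflect the Cycles client API. The rule that a cap is issued when the estimate exceeds the remaining budget is an assumption of this sketch, and the handler is kept synchronous for brevity where a real MCP handler would usually be async.

```typescript
type Decision = "ALLOW" | "ALLOW_WITH_CAPS" | "DENY";
type Reservation = { id: number; decision: Decision; granted: number };
type HandlerOutput<T> = { value: T; actual: number };
type ToolResult<T> = { ok: true; value: T } | { ok: false; reason: string };

// In-memory stand-in for a budget service. Hypothetical; not the Cycles API.
class InMemoryBudget {
  private remaining: number;
  private holds = new Map<number, number>();
  private nextId = 1;

  constructor(limit: number) {
    this.remaining = limit;
  }

  reserve(estimate: number): Reservation {
    if (this.remaining <= 0) return { id: 0, decision: "DENY", granted: 0 };
    const granted = Math.min(estimate, this.remaining);
    const id = this.nextId++;
    this.remaining -= granted;
    this.holds.set(id, granted);
    const decision: Decision = granted < estimate ? "ALLOW_WITH_CAPS" : "ALLOW";
    return { id, decision, granted };
  }

  // Settles the hold: actual usage is charged, the unused part is returned.
  commit(id: number, actual: number): void {
    const held = this.holds.get(id) ?? 0;
    this.holds.delete(id);
    this.remaining += held - Math.min(actual, held);
  }

  // Returns the whole hold to the budget.
  release(id: number): void {
    this.remaining += this.holds.get(id) ?? 0;
    this.holds.delete(id);
  }

  balance(): number {
    return this.remaining;
  }
}

// Reserve first, run the handler only if allowed, then commit or release.
function runWithBudget<T>(
  budget: InMemoryBudget,
  estimate: number,
  handler: (cap: number) => HandlerOutput<T>,
): ToolResult<T> {
  const reservation = budget.reserve(estimate);
  if (reservation.decision === "DENY") {
    return { ok: false, reason: "budget_exceeded" }; // handler never runs
  }
  try {
    // The handler receives the granted amount and is expected to stay within it.
    const { value, actual } = handler(reservation.granted);
    budget.commit(reservation.id, actual);
    return { ok: true, value };
  } catch {
    budget.release(reservation.id);
    return { ok: false, reason: "handler_failed" };
  }
}

const budget = new InMemoryBudget(3);
let sent = 0;
const sendEmail = (): HandlerOutput<string> => {
  sent += 1;
  return { value: "sent", actual: 1 };
};

const first = runWithBudget(budget, 1, sendEmail);
const failed = runWithBudget(budget, 1, () => {
  throw new Error("SMTP down");
});
const denied = runWithBudget(new InMemoryBudget(0), 1, sendEmail);
```

In this sketch the failed call leaves the balance untouched because its reservation is released, and the denied call never reaches `sendEmail`.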
4. Scope the budget to the thing you actually want to protect
Most budget mistakes come from choosing the wrong boundary.
An organization-level cap may be too broad. A per-tool cap may be too narrow. A per-user cap may be useful for attribution but insufficient for multi-tenant enforcement if every call still lands against the same shared gateway key.
For MCP servers, useful budget boundaries are usually:
- Tenant: one customer cannot exhaust another customer's allocation.
- Workspace or environment: staging and production have different risk.
- Workflow: support triage gets a different budget from invoice reconciliation.
- Agent run: one runaway conversation cannot consume the whole tenant budget.
- Toolset: email, refund, search, and code execution have different risk.
The HTTP MCP server guide calls out an important deployment detail: the MCP server's Cycles API key is the gateway's identity. Per-user attribution can be carried as audit context, but enforcement still depends on mapping the request to the right tenant and scope before calling Cycles.
If the scope is wrong, the budget decision may be technically successful and operationally useless.
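As a rough illustration of mapping a request to budget boundaries before the budget call, the helper below derives one scope string per boundary from the list above. The `ToolCallContext` fields and the path-style scope format are hypothetical and are not the Cycles scope syntax. The user ID is deliberately left out of the enforcement scope and treated as audit context only.

```typescript
// Hypothetical request context. Field names are illustrative.
type ToolCallContext = {
  tenantId: string;
  environment: "staging" | "production";
  workflow: string;
  runId: string;
  toolset: string;
  userId?: string; // attribution only, not part of the enforcement scope
};

// Builds one scope per boundary, from broadest (tenant) to narrowest (toolset).
function budgetScopes(ctx: ToolCallContext): string[] {
  if (ctx.tenantId.trim() === "") {
    // Without a tenant the reservation would land on a shared bucket.
    throw new Error("tenantId is required to scope a budget reservation");
  }
  const tenant = `tenant:${ctx.tenantId}`;
  const env = `${tenant}/env:${ctx.environment}`;
  const workflow = `${env}/workflow:${ctx.workflow}`;
  const run = `${workflow}/run:${ctx.runId}`;
  const toolset = `${run}/toolset:${ctx.toolset}`;
  return [tenant, env, workflow, run, toolset];
}

const exampleContext: ToolCallContext = {
  tenantId: "acme",
  environment: "production",
  workflow: "support_triage",
  runId: "run_42",
  toolset: "email",
  userId: "user_7",
};

const scopes = budgetScopes(exampleContext);
```

Failing loudly on a missing tenant is the point of the sketch: a reservation that succeeds against the wrong scope protects nothing.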
5. Use small estimates first, then tune from actuals
Budget limits need estimates before execution and actuals after execution.
For deterministic tools, estimates can be simple:
- send_email: fixed credits or RISK_POINTS estimate.
- lookup_customer: small fixed credits estimate.
- issue_refund: risk score tied to amount band.
For variable-cost tools, start conservative:
- Model calls: estimate tokens or microcents from the requested model and max output.
- Search tools: estimate by maximum result count or provider price.
- Batch actions: estimate per item, then cap the item count.
After success, commit the actual usage. That feedback loop is what lets operators tune budgets without guessing forever. If a tool regularly commits much less than it reserves, lower the estimate. If it regularly commits more, raise the estimate or cap the request.
For unit choices, see Understanding Units in Cycles.
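The estimate-then-tune loop could look something like the following. The amount bands, point values, 20% tolerance, and function names are made-up illustrations rather than recommended numbers.

```typescript
// Hypothetical amount bands for issue_refund. Thresholds and points are illustrative.
function refundRiskPoints(amountUsd: number): number {
  if (amountUsd <= 25) return 1;
  if (amountUsd <= 100) return 5;
  return 20;
}

// Conservative estimate for a model call: assume the full output allowance is used.
function estimateModelTokens(promptTokens: number, maxOutputTokens: number): number {
  return promptTokens + maxOutputTokens;
}

type UsageSample = { reserved: number; committed: number };

// Suggests a new estimate from observed reserve/commit pairs.
// Within the tolerance band the current estimate is kept as-is.
function suggestEstimate(
  current: number,
  samples: UsageSample[],
  tolerance = 0.2,
): number {
  const totalReserved = samples.reduce((sum, s) => sum + s.reserved, 0);
  const totalCommitted = samples.reduce((sum, s) => sum + s.committed, 0);
  if (totalReserved === 0) return current; // nothing observed yet
  const ratio = totalCommitted / totalReserved;
  if (Math.abs(ratio - 1) <= tolerance) return current;
  return Math.ceil(current * ratio);
}

const lowered = suggestEstimate(100, [
  { reserved: 100, committed: 40 },
  { reserved: 100, committed: 60 },
]);
const unchanged = suggestEstimate(100, [{ reserved: 100, committed: 95 }]);
const raised = suggestEstimate(100, [{ reserved: 100, committed: 150 }]);
```

Here a tool that commits about half of what it reserves gets a lower suggested estimate, and one that commits 50% more gets a higher one. An operator would still review the suggestion before applying it.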
6. Test denial before trusting the rollout
A budget limit is not real until you have watched it deny a tool call before the handler executes.
Use a small test budget and a harmless tool first:
- Create a tenant or test scope with a small budget.
- Configure one MCP tool handler to reserve before execution.
- Call the tool until the budget is exhausted.
- Confirm the next call is denied before the handler runs.
- Confirm successful calls commit actual usage.
- Confirm failed calls release reservations.
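The denial check in the steps above can be scripted. The sketch below covers only the "denied before the handler runs" step, using a hypothetical `TestBudget` with a fixed cost per call. A real test would point at a small test scope in the actual budget service and would also verify commits and releases.

```typescript
// Minimal fixed-cost budget for the test. Hypothetical stand-in for a real budget service.
class TestBudget {
  private remaining: number;

  constructor(limit: number) {
    this.remaining = limit;
  }

  // Returns true and deducts the cost only when the full cost fits.
  tryReserve(cost: number): boolean {
    if (cost > this.remaining) return false;
    this.remaining -= cost;
    return true;
  }
}

let handlerRuns = 0;

// Harmless tool used only to observe whether the handler ran.
function pingHandler(): string {
  handlerRuns += 1;
  return "pong";
}

type CallOutcome = { status: "ok" | "denied"; output?: string };

// Reserve first: the handler is reached only when the reservation succeeds.
function gatedPing(budget: TestBudget): CallOutcome {
  if (!budget.tryReserve(1)) return { status: "denied" };
  return { status: "ok", output: pingHandler() };
}

// Small budget: three calls fit, so the fourth must be denied.
const testBudget = new TestBudget(3);
const outcomes: CallOutcome[] = [];
for (let i = 0; i < 4; i++) {
  outcomes.push(gatedPing(testBudget));
}
```

The signal to look for is the handler run count matching the budget rather than the number of calls attempted.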
Then test the operator path:
- Does the agent degrade gracefully?
- Does the user get a useful failure message?
- Does the audit trail show the denied action?
- Can the operator tell which tenant, workflow, agent, and toolset consumed the budget?
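To make the operator-path questions above concrete, a denial could produce a user-facing message, a hint for the agent, and an audit record in one step. All field names, the event label, and the message wording below are hypothetical. The timestamp is supplied by the caller so the function stays deterministic.

```typescript
// Hypothetical shape of a denied call. Field names are illustrative.
type DeniedCall = {
  tenant: string;
  workflow: string;
  agent: string;
  toolset: string;
  tool: string;
  reason: "budget_exceeded";
  deniedAt: string; // ISO 8601 timestamp supplied by the caller
};

type DenialResponse = {
  userMessage: string; // shown to the end user
  agentHint: string; // returned to the agent so it can degrade gracefully
  audit: DeniedCall & { event: "tool_call_denied" }; // written to the audit trail
};

function buildDenial(call: DeniedCall): DenialResponse {
  return {
    userMessage:
      `The ${call.tool} action was not run because the ` +
      `${call.toolset} budget for this workflow is exhausted.`,
    agentHint:
      "Do not retry this tool in the current run. Summarize progress and stop, or ask for more budget.",
    audit: { event: "tool_call_denied", ...call },
  };
}

const denial = buildDenial({
  tenant: "acme",
  workflow: "support_triage",
  agent: "triage_agent",
  toolset: "email",
  tool: "send_email",
  reason: "budget_exceeded",
  deniedAt: "2025-01-01T00:00:00Z",
});
```

Keeping tenant, workflow, agent, and toolset on the audit record is what lets an operator answer the last question in the list without joining logs by hand.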
The first-rollout guide helps you choose whether to start with tenant budgets, run budgets, or model-call guardrails. For MCP servers, the best first test is usually one paid or customer-visible tool with a small budget and a clear denial path.
7. Avoid the two common false starts
False start 1: Ask the agent to call budget tools voluntarily.
That can help during evaluation, but it is not hard enforcement. If the business tool still works when the agent skips cycles_reserve, the budget is advisory.
False start 2: Log usage after the tool runs.
Post-hoc events are useful for audit, reporting, and tuning. They are not preventative. If the goal is to stop the next side effect, use decide or reserve before execution.
Those two false starts are why MCP budget limits belong in the handler, gateway, harness, or service boundary that the tool must pass through.
Resource links
- Cycles MCP server quickstart — expose Cycles budget tools to MCP-compatible agents.
- Cycles MCP integration guide — patterns, resources, prompts, and transport options.
- Running the MCP server over HTTP — shared remote MCP gateway deployment notes.
- Add Hard Budgets to MCP Tools Before They Execute — implementation companion with a TypeScript wrapper.
- How Reserve-Commit Works — lifecycle reference.
- Understanding Units in Cycles — monetary, token, credit, and risk units.
- Model Context Protocol documentation — official MCP introduction.