Integrating Cycles with MCP
The Model Context Protocol (MCP) is the standard way AI hosts discover and call tools. The Cycles MCP Server exposes Cycles budget authority as MCP tools, so any MCP-compatible agent (Claude Desktop, Claude Code, Cursor, Windsurf, custom agents) can reserve, spend, and release budget without any SDK integration in the agent's own code.
This guide covers the integration patterns, resources, prompts, and transport options available through the MCP server.
Prerequisites
npm install @runcycles/mcp-server # or use npx at runtime
export CYCLES_API_KEY="cyc_live_..." # from Admin Server
export CYCLES_BASE_URL="http://localhost:7878" # optional
For local development without an API key:
export CYCLES_MOCK=true
Need setup help? See Getting Started with the MCP Server for per-host configuration (Claude Desktop, Claude Code, Cursor, Windsurf).
Pattern 1: Simple reserve/commit
The most common pattern — reserve budget before a costly operation, commit actual usage after:
Step 1 — Reserve:
{
"idempotencyKey": "a1b2c3d4-e5f6-7890-abcd-ef1234567890",
"subject": { "tenant": "acme", "agent": "researcher" },
"action": { "kind": "llm.completion", "name": "claude-sonnet" },
"estimate": { "unit": "USD_MICROCENTS", "amount": 50000 },
"ttlMs": 60000
}
The response includes decision: "ALLOW" and a reservationId.
Step 2 — Execute the LLM call or tool invocation.
Step 3 — Commit:
{
"reservationId": "rsv_...",
"idempotencyKey": "commit-a1b2c3d4",
"actual": { "unit": "USD_MICROCENTS", "amount": 35000 },
"metrics": {
"tokensInput": 1200,
"tokensOutput": 800,
"latencyMs": 2500,
"modelVersion": "claude-sonnet-4-20250514"
}
}
The unused 15,000 microcents are returned to the budget pool.
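The reserve, execute, commit sequence above can be wrapped in one helper for a custom agent. This is a minimal sketch, not the server's API: `call_tool` is a hypothetical adapter around whatever MCP client the agent uses, the response is assumed to be a dict with `decision` and `reservationId` keys, and `work` is a hypothetical callable that returns the actual cost. If `work` raises, the helper releases the reservation instead of committing it.

```python
import uuid
from typing import Any, Callable, Dict

# Hypothetical adapter: call_tool(tool_name, arguments) -> parsed tool result.
ToolCaller = Callable[[str, Dict[str, Any]], Dict[str, Any]]


def run_with_budget(
    call_tool: ToolCaller,
    subject: Dict[str, str],
    action: Dict[str, str],
    estimate: int,
    work: Callable[[], int],
) -> int:
    """Reserve budget, run `work`, then commit its actual cost.

    `work` returns the actual cost in USD_MICROCENTS. If it raises, the
    reservation is released instead of committed.
    """

    def release(reservation_id: str, reason: str) -> None:
        call_tool("cycles_release", {
            "reservationId": reservation_id,
            "idempotencyKey": str(uuid.uuid4()),
            "reason": reason,
        })

    reservation = call_tool("cycles_reserve", {
        "idempotencyKey": str(uuid.uuid4()),
        "subject": subject,
        "action": action,
        "estimate": {"unit": "USD_MICROCENTS", "amount": estimate},
        "ttlMs": 60000,
    })
    decision = reservation.get("decision")
    if decision != "ALLOW":
        # This sketch only handles a plain ALLOW. Anything else that still
        # produced a reservation is released so it is not left dangling.
        if "reservationId" in reservation:
            release(reservation["reservationId"], f"unhandled decision {decision}")
        raise RuntimeError(f"budget not granted: {decision}")

    reservation_id = reservation["reservationId"]
    try:
        actual = work()
    except Exception as exc:
        release(reservation_id, f"operation failed: {exc}")
        raise

    call_tool("cycles_commit", {
        "reservationId": reservation_id,
        "idempotencyKey": str(uuid.uuid4()),
        "actual": {"unit": "USD_MICROCENTS", "amount": actual},
    })
    return actual
```

Passing the tool caller in as a parameter keeps the helper independent of any particular MCP client and makes it easy to exercise against a fake in tests.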
If the operation fails, call cycles_release instead:
{
"reservationId": "rsv_...",
"idempotencyKey": "release-a1b2c3d4",
"reason": "LLM call failed with timeout"
}
Pattern 2: Preflight + reserve
Use cycles_decide for a lightweight check before making a reservation. It is useful at the start of a workflow, when the agent is still choosing a strategy:
{
"idempotencyKey": "decide-uuid",
"subject": { "tenant": "acme", "workflow": "summarize" },
"action": { "kind": "llm.completion", "name": "claude-opus" },
"estimate": { "unit": "USD_MICROCENTS", "amount": 200000 }
}
If the decision is ALLOW, proceed with a full cycles_reserve. If DENY, the agent can switch to a cheaper model or skip the operation, without having locked any budget.
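The preflight check above lends itself to a simple fallback loop: try the preferred model first and step down until one is allowed. The sketch below assumes a hypothetical `call_tool` adapter and a response dict with a `decision` key; the candidate list and its estimates are supplied by the caller.

```python
import uuid
from typing import Any, Callable, Dict, List, Optional, Tuple

# Hypothetical adapter: call_tool(tool_name, arguments) -> parsed tool result.
ToolCaller = Callable[[str, Dict[str, Any]], Dict[str, Any]]


def pick_affordable_model(
    call_tool: ToolCaller,
    subject: Dict[str, str],
    candidates: List[Tuple[str, int]],
) -> Optional[str]:
    """Return the first model whose estimate gets ALLOW from cycles_decide.

    `candidates` is an ordered list of (model name, estimated cost in
    USD_MICROCENTS), most preferred first. Returns None if every candidate
    is denied, in which case the caller can skip the operation.
    """
    for name, estimate in candidates:
        result = call_tool("cycles_decide", {
            "idempotencyKey": str(uuid.uuid4()),
            "subject": subject,
            "action": {"kind": "llm.completion", "name": name},
            "estimate": {"unit": "USD_MICROCENTS", "amount": estimate},
        })
        if result.get("decision") == "ALLOW":
            return name
    return None
```

A call might look like `pick_affordable_model(call_tool, {"tenant": "acme", "workflow": "summarize"}, [("claude-opus", 200000), ("claude-sonnet", 50000)])`. No budget is locked by this loop, so the chosen model still needs its own cycles_reserve before the call runs.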
Pattern 3: Graceful degradation
When budget is running low, cycles_reserve may return ALLOW_WITH_CAPS instead of a flat ALLOW. Caps tell the agent how to constrain the operation:
{
"decision": "ALLOW_WITH_CAPS",
"reservationId": "rsv_...",
"caps": {
"maxTokens": 2000,
"toolDenylist": ["web_search", "code_execution"],
"cooldownMs": 5000
}
}
The agent should respect these caps:
- maxTokens: limit output tokens on the LLM call
- maxStepsRemaining: limit remaining agent steps
- toolAllowlist / toolDenylist: restrict which tools are available
- cooldownMs: wait between operations to slow the spend rate
See Caps and the Three-Way Decision Model for details.
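One way a custom agent might apply the caps described above is to translate them into a constrained copy of the call it was about to make. This is a sketch under assumptions: the cap names come from this guide, but the request fields (`max_tokens`, `max_steps`, `tools`, `cooldown_s`) are hypothetical names for the agent's own settings.

```python
from typing import Any, Dict, List


def apply_caps(
    request: Dict[str, Any],
    tools: List[str],
    caps: Dict[str, Any],
) -> Dict[str, Any]:
    """Return a copy of `request` constrained by the caps on a reservation.

    The input request is not modified. Field names on the request are
    hypothetical; only the cap names are taken from the guide.
    """
    constrained = dict(request)

    if "maxTokens" in caps:
        # Never raise the caller's own limit; only lower it to the cap.
        requested = request.get("max_tokens", caps["maxTokens"])
        constrained["max_tokens"] = min(requested, caps["maxTokens"])

    if "maxStepsRemaining" in caps:
        constrained["max_steps"] = caps["maxStepsRemaining"]

    allowed = list(tools)
    if "toolAllowlist" in caps:
        allowed = [t for t in allowed if t in caps["toolAllowlist"]]
    if "toolDenylist" in caps:
        allowed = [t for t in allowed if t not in caps["toolDenylist"]]
    constrained["tools"] = allowed

    # Seconds to wait before the next operation; 0 when no cooldown is set.
    constrained["cooldown_s"] = caps.get("cooldownMs", 0) / 1000
    return constrained
```

With the example caps above and a tool list of web_search, calculator, and code_execution, only calculator remains available and the output limit drops to 2000 tokens.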
Pattern 4: Long-running operations
For operations that may exceed the default 60-second TTL, use cycles_extend as a heartbeat:
Reserve with a TTL:
{
"idempotencyKey": "long-op-uuid",
"subject": { "tenant": "acme", "workflow": "data-pipeline" },
"action": { "kind": "batch", "name": "process-dataset" },
"estimate": { "unit": "USD_MICROCENTS", "amount": 500000 },
"ttlMs": 120000
}
Extend periodically (e.g., every 60 seconds):
{
"reservationId": "rsv_...",
"idempotencyKey": "extend-1-uuid",
"extendByMs": 120000
}
Commit when done. If the agent crashes, the reservation expires automatically and the budget is returned to the pool.
See TTL, Grace Period, and Extend for the full TTL model.
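The heartbeat described above can be sketched as a small context manager that extends the reservation from a background thread while the long operation runs. The `call_tool` adapter and the class name are hypothetical, and the sketch assumes the adapter is safe to call from a second thread; an agent built on an async MCP client would use a background task instead.

```python
import threading
import uuid
from typing import Any, Callable, Dict

# Hypothetical adapter: call_tool(tool_name, arguments) -> parsed tool result.
ToolCaller = Callable[[str, Dict[str, Any]], Dict[str, Any]]


class ReservationHeartbeat:
    """Call cycles_extend every `interval_s` seconds until the block exits."""

    def __init__(
        self,
        call_tool: ToolCaller,
        reservation_id: str,
        interval_s: float = 60.0,
        extend_by_ms: int = 120000,
    ) -> None:
        self._call_tool = call_tool
        self._reservation_id = reservation_id
        self._interval_s = interval_s
        self._extend_by_ms = extend_by_ms
        self._stop = threading.Event()
        self._thread = threading.Thread(target=self._run, daemon=True)

    def _run(self) -> None:
        # Event.wait returns False on timeout (time to extend) and True
        # once stop has been requested, which ends the loop.
        while not self._stop.wait(self._interval_s):
            self._call_tool("cycles_extend", {
                "reservationId": self._reservation_id,
                "idempotencyKey": str(uuid.uuid4()),
                "extendByMs": self._extend_by_ms,
            })

    def __enter__(self) -> "ReservationHeartbeat":
        self._thread.start()
        return self

    def __exit__(self, exc_type, exc, tb) -> None:
        self._stop.set()
        self._thread.join()
```

Usage is `with ReservationHeartbeat(call_tool, reservation_id): process_dataset()`, followed by the commit. Because the heartbeat stops when the process dies, a crashed agent simply lets the reservation expire, which matches the behavior described above.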
Pattern 5: Fire-and-forget events
When you can't pre-estimate cost (e.g., webhook-triggered actions, post-hoc metering), use cycles_create_event to record usage directly:
{
"idempotencyKey": "event-uuid",
"subject": { "tenant": "acme", "app": "chatbot" },
"action": { "kind": "llm.completion", "name": "gpt-4o" },
"actual": { "unit": "USD_MICROCENTS", "amount": 42000 },
"metrics": {
"tokensInput": 3000,
"tokensOutput": 1500,
"latencyMs": 1800
}
}
No reservation needed: the event is applied atomically to all derived scopes. See Events and Direct Debit.
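For post-hoc metering, the amount is usually derived from the token counts once the call has finished. The sketch below shows that arithmetic and builds the event payload. The per-token rates are placeholders chosen for illustration, not real pricing, and `call_tool` is a hypothetical adapter around the agent's MCP client.

```python
import uuid
from typing import Any, Callable, Dict

# Hypothetical adapter: call_tool(tool_name, arguments) -> parsed tool result.
ToolCaller = Callable[[str, Dict[str, Any]], Dict[str, Any]]

# Placeholder rates in USD_MICROCENTS per token. NOT real pricing.
INPUT_RATE = 4
OUTPUT_RATE = 20


def usage_cost(tokens_input: int, tokens_output: int) -> int:
    """Cost of one call in USD_MICROCENTS under the placeholder rates."""
    return tokens_input * INPUT_RATE + tokens_output * OUTPUT_RATE


def record_usage(
    call_tool: ToolCaller,
    subject: Dict[str, str],
    action: Dict[str, str],
    tokens_input: int,
    tokens_output: int,
    latency_ms: int,
) -> Dict[str, Any]:
    """Record already-incurred usage with cycles_create_event."""
    return call_tool("cycles_create_event", {
        "idempotencyKey": str(uuid.uuid4()),
        "subject": subject,
        "action": action,
        "actual": {
            "unit": "USD_MICROCENTS",
            "amount": usage_cost(tokens_input, tokens_output),
        },
        "metrics": {
            "tokensInput": tokens_input,
            "tokensOutput": tokens_output,
            "latencyMs": latency_ms,
        },
    })
```

Under these placeholder rates, 3,000 input tokens and 1,500 output tokens come to 3000 × 4 + 1500 × 20 = 42,000 microcents, the amount used in the example payload above.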
Pattern 6: Multi-step workflow
For workflows with multiple costly steps, check the balance first, then reserve per step:
Check balance:
{ "tenant": "acme", "workflow": "research-report" }
Step 1: cycles_reserve → execute → cycles_commit
Step 2: cycles_reserve → execute → cycles_commit
Step 3: cycles_reserve → DENY (budget exhausted) → degrade or stop
Each step gets its own reservation, so the budget authority can deny mid-workflow if the agent is burning through budget too fast. See Common Budget Patterns for more examples.
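The per-step loop can be sketched as follows. It is illustrative only: `call_tool` is a hypothetical adapter, the response is assumed to be a dict with `decision`, `reservationId`, and optional `caps` keys, and each step's `work` callable is a hypothetical function that receives the caps and returns the actual cost. The initial balance check is omitted.

```python
import uuid
from typing import Any, Callable, Dict, List, Tuple

# Hypothetical adapter: call_tool(tool_name, arguments) -> parsed tool result.
ToolCaller = Callable[[str, Dict[str, Any]], Dict[str, Any]]

# One step: (action, estimated cost, work). `work` receives any caps from the
# reservation and returns the actual cost in USD_MICROCENTS.
Step = Tuple[Dict[str, str], int, Callable[[Dict[str, Any]], int]]


def run_workflow(
    call_tool: ToolCaller,
    subject: Dict[str, str],
    steps: List[Step],
) -> List[str]:
    """Run steps in order, one reservation each; stop at the first DENY.

    Returns the action names of the steps that completed.
    """
    completed: List[str] = []
    for action, estimate, work in steps:
        reservation = call_tool("cycles_reserve", {
            "idempotencyKey": str(uuid.uuid4()),
            "subject": subject,
            "action": action,
            "estimate": {"unit": "USD_MICROCENTS", "amount": estimate},
            "ttlMs": 60000,
        })
        if reservation.get("decision") == "DENY":
            break  # budget exhausted mid-workflow: degrade or stop here

        reservation_id = reservation["reservationId"]
        try:
            actual = work(reservation.get("caps", {}))
        except Exception as exc:
            # Never leave the reservation dangling when a step fails.
            call_tool("cycles_release", {
                "reservationId": reservation_id,
                "idempotencyKey": str(uuid.uuid4()),
                "reason": f"step failed: {exc}",
            })
            raise

        call_tool("cycles_commit", {
            "reservationId": reservation_id,
            "idempotencyKey": str(uuid.uuid4()),
            "actual": {"unit": "USD_MICROCENTS", "amount": actual},
        })
        completed.append(action["name"])
    return completed
```

Returning the list of completed steps lets the caller decide what to do with a partial result, for example summarizing what was gathered before the budget ran out.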
Resources
The MCP server exposes resources for inspecting budget state:
| URI | Description |
|---|---|
| cycles://balances/{tenant} | Current budget balance for a tenant scope |
| cycles://reservations/{reservation_id} | Reservation details by ID |
| cycles://docs/quickstart | Getting started guide |
| cycles://docs/patterns | Integration patterns reference |
Use resources when you need to inspect state without calling a tool — for example, reading a tenant's balance as context before deciding on a strategy.
Prompts
The server ships three prompts that help AI assistants work with Cycles:
integrate_cycles
Generates Cycles integration code for a given language and use case.
| Parameter | Required | Description |
|---|---|---|
| language | No | Programming language (default: typescript) |
| use_case | No | Context: llm-calls, api-gateway, multi-agent |
"Use the integrate_cycles prompt to generate Python code for an LLM-calls use case"
diagnose_overrun
Guides through debugging budget exhaustion or a stopped run.
| Parameter | Required | Description |
|---|---|---|
| reservation_id | No | Specific reservation to investigate |
| scope | No | Tenant or scope identifier to check |
"Use the diagnose_overrun prompt to figure out why my agent stopped — scope is tenant:acme"
design_budget_strategy
Recommends scope hierarchy, budget limits, units, TTL settings, and degradation strategy.
| Parameter | Required | Description |
|---|---|---|
| description | Yes | Description of the workflow to budget |
| tenant_model | No | e.g., per-customer, per-team, single-tenant |
"Use the design_budget_strategy prompt for my multi-agent customer support system with per-customer tenants"
HTTP transport
For web integrations or remote deployments, run the server with HTTP transport:
npx @runcycles/mcp-server --transport http
The server starts on port 3000 (configurable via the PORT environment variable) with:
- GET /health: health check ({"status": "ok", "version": "0.1.1"})
- POST /mcp: MCP Streamable HTTP endpoint
- GET /mcp: MCP SSE endpoint
- DELETE /mcp: MCP session cleanup
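A deployment script can poll the health endpoint before pointing agents at the server. The sketch below uses only the Python standard library and assumes the response body has the shape shown above; the function names and the default base URL are illustrative.

```python
import json
import urllib.request


def parse_health(body: str) -> bool:
    """Return True if a /health response body reports status "ok"."""
    try:
        payload = json.loads(body)
    except ValueError:
        return False
    return isinstance(payload, dict) and payload.get("status") == "ok"


def is_healthy(base_url: str = "http://localhost:3000", timeout_s: float = 2.0) -> bool:
    """GET {base_url}/health and report whether the server looks healthy.

    Network-level failures and undecodable bodies count as unhealthy.
    """
    try:
        with urllib.request.urlopen(f"{base_url}/health", timeout=timeout_s) as resp:
            return resp.status == 200 and parse_health(resp.read().decode("utf-8"))
    except (OSError, ValueError):
        return False
```

Keeping the body parsing separate from the network call makes the check easy to test without a running server.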
Error handling
| Error Code | Meaning | Recommended Action |
|---|---|---|
| BUDGET_EXCEEDED | Not enough budget | Degrade to cheaper model or stop |
| RESERVATION_EXPIRED | TTL elapsed before commit | Re-reserve if work is still needed |
| RESERVATION_FINALIZED | Already committed or released | No action needed |
| DEBT_OUTSTANDING | Scope has unpaid debt | Wait for admin to fund the budget |
| OVERDRAFT_LIMIT_EXCEEDED | Over-limit state | Wait for admin to reconcile |
See Error Codes and Error Handling for the full reference.
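In a custom agent, the table above maps naturally onto a lookup from error code to next step. The sketch below encodes only the codes and recommendations listed in the table. The enum and function names are hypothetical, and how an error code is surfaced in a tool result is not covered here, so the function takes the code as a plain string.

```python
from enum import Enum


class NextStep(Enum):
    DEGRADE_OR_STOP = "degrade to a cheaper model or stop"
    RE_RESERVE = "re-reserve if work is still needed"
    NOTHING = "no action needed"
    WAIT_FOR_ADMIN = "wait for an admin to fund or reconcile the budget"
    FAIL = "unrecognized error: surface it to the caller"


# Recommendations copied from the error table in this guide.
RECOMMENDED = {
    "BUDGET_EXCEEDED": NextStep.DEGRADE_OR_STOP,
    "RESERVATION_EXPIRED": NextStep.RE_RESERVE,
    "RESERVATION_FINALIZED": NextStep.NOTHING,
    "DEBT_OUTSTANDING": NextStep.WAIT_FOR_ADMIN,
    "OVERDRAFT_LIMIT_EXCEEDED": NextStep.WAIT_FOR_ADMIN,
}


def recommended_step(error_code: str) -> NextStep:
    """Map a Cycles error code to the recommended next step."""
    return RECOMMENDED.get(error_code, NextStep.FAIL)
```

Unknown codes fall through to FAIL rather than being silently ignored, so a new error type added on the server side is noticed instead of swallowed.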
Key points
- Zero code changes. Add the MCP server to your agent's config and it gets budget tools automatically via MCP discovery.
- Always finalize reservations. Every cycles_reserve must be followed by cycles_commit or cycles_release; never leave reservations dangling.
- Use unique idempotency keys. Every tool call needs a unique idempotencyKey (UUID recommended) to ensure exactly-once processing.
- Respect caps. When the decision is ALLOW_WITH_CAPS, constrain the operation accordingly.
- Heartbeat long operations. Use cycles_extend for operations that may exceed the reservation TTL.
- Tag for observability. Use action.tags and metrics.custom to add context for debugging and auditing.
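The idempotency point deserves one illustration. A unique key per logical call only gives exactly-once processing if a retry of that same call reuses the key. The sketch below assumes the server deduplicates on idempotencyKey, which is the usual meaning of the term but is not spelled out in this guide. `call_tool` is a hypothetical adapter, and treating ConnectionError as the retryable failure is also an assumption.

```python
import time
import uuid
from typing import Any, Callable, Dict

# Hypothetical adapter: call_tool(tool_name, arguments) -> parsed tool result.
ToolCaller = Callable[[str, Dict[str, Any]], Dict[str, Any]]


def call_with_retry(
    call_tool: ToolCaller,
    tool_name: str,
    arguments: Dict[str, Any],
    attempts: int = 3,
    backoff_s: float = 0.5,
) -> Dict[str, Any]:
    """Retry a tool call on connection errors, reusing one idempotencyKey.

    The key is generated once, before the first attempt, so every retry
    carries the same key and can be recognized as the same logical call.
    """
    if attempts < 1:
        raise ValueError("attempts must be at least 1")

    key = arguments.get("idempotencyKey") or str(uuid.uuid4())
    payload = {**arguments, "idempotencyKey": key}

    for attempt in range(1, attempts + 1):
        try:
            return call_tool(tool_name, payload)
        except ConnectionError:
            if attempt == attempts:
                raise
            # Linear backoff between attempts.
            time.sleep(backoff_s * attempt)

    raise AssertionError("unreachable")
```

Generating a fresh key inside the retry loop would defeat the purpose: a commit whose response was lost in transit could then be applied twice.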
Next steps
- Getting Started with the MCP Server — setup guide for each AI host
- Architecture Overview — how the MCP server fits into the Cycles stack
- Cost Estimation Cheat Sheet — pricing reference for estimates
- Troubleshooting and FAQ — common issues and solutions
