Standard Metrics and Metadata in Cycles
Budget enforcement tells you whether work is allowed and how much it costs.
But production systems need more than cost numbers. They need to know what happened during execution — how many tokens were consumed, how long it took, which model version was used, and any custom data relevant for debugging or analytics.
That is what standard metrics and metadata provide.
Where metrics and metadata appear
Metrics and metadata can be attached to two operations:
- Commits (`POST /v1/reservations/{id}/commit`) — when finalizing a reservation
- Events (`POST /v1/events`) — when recording direct debit usage
Both accept an optional metrics field and an optional metadata field.
Standard metrics
The protocol defines a StandardMetrics schema with four named fields and an extensible custom map:
tokens_input
```json
"tokens_input": 1250
```

The number of input tokens consumed by the operation. Integer, minimum 0.
Useful for tracking prompt size and correlating with model pricing.
tokens_output
```json
"tokens_output": 430
```

The number of output tokens generated. Integer, minimum 0.
Useful for tracking generation length and correlating with output pricing (which is typically higher than input pricing).
latency_ms
```json
"latency_ms": 1840
```

The total operation latency in milliseconds. Integer, minimum 0.
Useful for SLA monitoring, performance analysis, and identifying slow operations that may need different TTL or timeout handling.
model_version
```json
"model_version": "gpt-4o-mini-2024-07-18"
```

The actual model or tool version used. String, maximum 128 characters.
This is important because the model requested and the model used are not always the same. Providers may route to different versions, and this field captures what actually ran.
custom
```json
"custom": {
  "cache_hit": "true",
  "region": "us-east-1",
  "retry_count": "2"
}
```

An open map for arbitrary additional metrics. Values can be any JSON type (strings, numbers, booleans, objects).
Use custom metrics for anything not covered by the standard fields — cache behavior, retry counts, routing decisions, feature flags, or domain-specific measurements.
A complete metrics example
```json
{
  "idempotency_key": "commit-run-42-step-7",
  "actual": { "unit": "USD_MICROCENTS", "amount": 285000 },
  "metrics": {
    "tokens_input": 1250,
    "tokens_output": 430,
    "latency_ms": 1840,
    "model_version": "gpt-4o-mini-2024-07-18",
    "custom": {
      "cache_hit": "false",
      "prompt_template": "summarize-v3"
    }
  }
}
```

Metadata
Metadata is a separate field from metrics. It is an open map for arbitrary key-value pairs:
```json
{
  "idempotency_key": "commit-run-42-step-7",
  "actual": { "unit": "USD_MICROCENTS", "amount": 285000 },
  "metadata": {
    "user_id": "user-456",
    "session_id": "session-001",
    "feature_flag_bucket": "variant-b",
    "external_trace_id": "otel-abc-xyz-789"
  }
}
```

Metadata is intended for application-level audit, debugging, and correlation to systems outside Cycles — not for operational metrics, and not for the server-managed trace_id and request_id (those flow through first-class response headers and response-body fields; see the next section).
Don't put the server trace_id in metadata
As of v0.1.25 the server manages its own W3C Trace Context trace_id (32 hex characters) and a per-request request_id. Both flow on response headers (X-Cycles-Trace-Id) and in response bodies — you don't need to (and shouldn't) copy them into metadata. Use metadata for application-level values such as your own OpenTelemetry or Datadog trace id, user id, or session id. See Correlation and Tracing.
Metrics vs metadata vs server correlation identifiers
- Metrics are about what happened during execution (tokens, latency, model version).
- Metadata is about application-level context (user IDs, session IDs, feature flags, external trace ids).
- Server correlation identifiers — `request_id`, `trace_id`, `correlation_id` — are managed natively by the Cycles server. You do not need to pack them into `metadata`.
All three are optional. All three are stored with the commit or event record. But they serve different analytical purposes. See Correlation and Tracing for the server's three-tier identifier model.
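The split can be sketched in a few lines of Python (`build_commit_body` is a hypothetical helper; the field names follow the JSON examples above):

```python
# Sketch of the split: metrics describe what happened during execution,
# metadata carries application-level context. Server correlation ids are
# deliberately absent; the server manages those itself.
def build_commit_body(idempotency_key: str, amount: int,
                      metrics: dict, metadata: dict) -> dict:
    return {
        "idempotency_key": idempotency_key,
        "actual": {"unit": "USD_MICROCENTS", "amount": amount},
        "metrics": metrics,    # tokens, latency, model version
        "metadata": metadata,  # user, session, external trace ids
    }

body = build_commit_body(
    "commit-run-42-step-7", 285000,
    metrics={"tokens_input": 1250, "tokens_output": 430, "latency_ms": 1840},
    metadata={"user_id": "user-456", "external_trace_id": "otel-abc-xyz-789"},
)
```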
Where metadata also appears
Metadata is accepted on several other operations beyond commits and events:
- Reservation creation (`POST /v1/reservations`) — attach context to the reservation itself
- Reservation extend (`POST /v1/reservations/{id}/extend`) — attach debugging metadata to extend operations
This means a full reservation lifecycle can carry metadata from creation through commit:
- Create reservation with `metadata: { "trace_id": "..." }`
- Extend with `metadata: { "heartbeat_seq": "3" }`
- Commit with `metadata: { "request_id": "..." }` and `metrics: { ... }`
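A hedged sketch of the payloads at each step, assuming the endpoints listed above (the reservation id `res-123` and the non-metadata body fields are illustrative, not prescribed by this section):

```python
# Metadata riding along at each lifecycle step. Paths follow the endpoints
# named above; everything besides the metadata/metrics fields is assumed.
reservation_id = "res-123"  # hypothetical id from the create response

steps = [
    ("POST", "/v1/reservations",
     {"metadata": {"external_trace_id": "otel-abc-xyz-789"}}),
    ("POST", f"/v1/reservations/{reservation_id}/extend",
     {"metadata": {"heartbeat_seq": "3"}}),
    ("POST", f"/v1/reservations/{reservation_id}/commit",
     {"idempotency_key": "commit-run-42-step-7",
      "metadata": {"request_id": "req-001"},
      "metrics": {"latency_ms": 1840}}),
]

for method, path, request_body in steps:
    print(method, path, sorted(request_body))
```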
Metrics in client code
Attach metrics and metadata through the context object inside a decorated function or annotated method. The SDK automatically includes these in the commit request when the function returns.
Python:

```python
import time

from runcycles import cycles, get_cycles_context, CyclesMetrics

@cycles(estimate=1000)
def chat(prompt: str) -> str:
    started = time.monotonic()
    response = call_llm(prompt)
    elapsed = int((time.monotonic() - started) * 1000)
    ctx = get_cycles_context()
    ctx.metrics = CyclesMetrics(
        tokens_input=response.usage.prompt_tokens,
        tokens_output=response.usage.completion_tokens,
        latency_ms=elapsed,
        model_version=response.model,
    )
    ctx.commit_metadata = {
        "request_id": request_id,  # application-level correlation values
        "trace_id": trace_id,
    }
    return response.text
```

Java:

```java
@Cycles("1000")
public ChatResponse chat(String prompt) {
    long started = System.currentTimeMillis();
    ChatResponse response = chatModel.call(prompt);
    long elapsed = System.currentTimeMillis() - started;
    CyclesReservationContext ctx = CyclesContextHolder.get();
    CyclesMetrics metrics = new CyclesMetrics();
    metrics.setTokensInput(response.getUsage().getPromptTokens());
    metrics.setTokensOutput(response.getUsage().getCompletionTokens());
    metrics.setLatencyMs(elapsed);
    metrics.setModelVersion(response.getMetadata().getModel());
    ctx.setMetrics(metrics);
    ctx.setCommitMetadata(Map.of(
        "request_id", requestId,  // application-level correlation values
        "trace_id", traceId
    ));
    return response;
}
```

Why standard metrics matter
Cost attribution
Tokens input and output, combined with model version, enable precise cost attribution:
- which model was used
- how many tokens it consumed
- what the actual cost was
This connects budget accounting to provider-level billing.
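For illustration, a minimal attribution sketch, assuming a local price table (the per-1K-token prices below are made-up placeholders, not real provider pricing):

```python
# Attribute cost from commit metrics using a hypothetical price table,
# keyed by the model_version actually recorded at commit time.
PRICE_PER_1K_TOKENS = {  # USD micro-cents per 1000 tokens (hypothetical)
    "gpt-4o-mini-2024-07-18": {"input": 1500, "output": 6000},
}

def attribute_cost(metrics: dict) -> int:
    """Estimated cost in USD micro-cents for one commit's metrics."""
    price = PRICE_PER_1K_TOKENS[metrics["model_version"]]
    cost_in = metrics["tokens_input"] * price["input"] / 1000
    cost_out = metrics["tokens_output"] * price["output"] / 1000
    return round(cost_in + cost_out)

cost = attribute_cost({
    "tokens_input": 1250, "tokens_output": 430,
    "model_version": "gpt-4o-mini-2024-07-18",
})
```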
Performance monitoring
Latency metrics across commits reveal:
- which actions are slow
- whether latency correlates with budget consumption
- where timeout or TTL adjustments are needed
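For example, latency percentiles can be computed from a batch of commit records with the standard library (the record shape mirrors the metrics examples above; the sample data is synthetic):

```python
import statistics

def latency_summary(commits: list[dict]) -> dict:
    """Summarize latency_ms across a batch of commit metrics records."""
    samples = sorted(c["metrics"]["latency_ms"] for c in commits
                     if "latency_ms" in c.get("metrics", {}))
    cuts = statistics.quantiles(samples, n=100)  # 99 percentile cut points
    return {"p50": cuts[49], "p95": cuts[94], "max": samples[-1]}

# Synthetic sample: 20 commits with latencies from 100 ms to 2000 ms
commits = [{"metrics": {"latency_ms": ms}} for ms in range(100, 2001, 100)]
summary = latency_summary(commits)
```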
Audit trail
Metadata creates a traceable path from budget operations back to the originating request, user, or workflow run.
When investigating a budget incident, metadata helps answer: who triggered this, from which session, as part of which trace?

Analytics
Over time, standard metrics enable aggregate analysis:
- average tokens per model call by action type
- latency distributions by model version
- cache hit rates across workflows
- cost efficiency trends
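A small aggregation sketch, assuming commit records shaped like the earlier examples:

```python
from collections import defaultdict

def avg_tokens_by_model(commits: list[dict]) -> dict:
    """Average total tokens per call, grouped by model_version."""
    totals = defaultdict(lambda: [0, 0])  # model -> [token_sum, call_count]
    for c in commits:
        m = c["metrics"]
        totals[m["model_version"]][0] += m["tokens_input"] + m["tokens_output"]
        totals[m["model_version"]][1] += 1
    return {model: token_sum / calls
            for model, (token_sum, calls) in totals.items()}

averages = avg_tokens_by_model([
    {"metrics": {"model_version": "gpt-4o-mini-2024-07-18",
                 "tokens_input": 1250, "tokens_output": 430}},
    {"metrics": {"model_version": "gpt-4o-mini-2024-07-18",
                 "tokens_input": 1000, "tokens_output": 200}},
])
```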
Best practices
Always include tokens and model version on LLM calls
These are the minimum metrics that make budget data actionable. Without them, cost numbers exist without context.
Use metadata for correlation IDs
Attach request_id, trace_id, or session_id to every commit. This makes it possible to join budget data with application logs and distributed traces.
Keep custom metrics stable
Treat custom metric keys like a schema. Changing keys breaks downstream analytics. Add new keys freely, but avoid renaming or removing existing ones without coordination.
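One way to enforce this is a hypothetical schema guard run before attaching custom metrics (the key set below is illustrative; maintain yours alongside the dashboards that consume it):

```python
# Treat custom metric keys as a schema: fail fast on unknown keys before
# they reach downstream analytics.
KNOWN_CUSTOM_KEYS = {"cache_hit", "region", "retry_count", "prompt_template"}

def unknown_custom_keys(custom: dict) -> list[str]:
    """Return custom-metric keys that are not part of the agreed schema."""
    return sorted(set(custom) - KNOWN_CUSTOM_KEYS)

unknown_custom_keys({"cache_hit": "true", "cache_hti": "true"})  # typo caught
```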
Do not put sensitive data in metrics or metadata
Metrics and metadata are stored and may be visible through admin interfaces or log aggregation. Do not include PII, secrets, or authentication tokens.
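A minimal scrubbing sketch, assuming a denylist of key names (the list below is illustrative, not exhaustive; an allowlist is safer where the set of legitimate keys is known):

```python
# Run before attaching metadata to a commit: drop keys that look
# sensitive, keep everything else unchanged.
SENSITIVE_KEYS = {"password", "api_key", "authorization", "email", "ssn"}

def scrub_metadata(metadata: dict) -> dict:
    """Remove sensitive-looking keys from a metadata map."""
    return {k: v for k, v in metadata.items()
            if k.lower() not in SENSITIVE_KEYS and "token" not in k.lower()}
```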
Summary
Standard metrics and metadata enrich budget operations with execution context:
- tokens_input and tokens_output — token consumption
- latency_ms — operation duration
- model_version — actual model used
- custom — extensible metrics map
- metadata — correlation IDs, audit context, and debugging data
These fields are optional but recommended. They turn budget accounting from raw cost numbers into actionable operational data.
Server-side operational metrics
The metrics above describe the execution-context fields a client attaches to each commit or event. They are stored with the protocol record and surface on commit / event responses and in admin audit trails.
Cycles also exposes Prometheus metrics on each service's /actuator/prometheus endpoint for operational monitoring. These are aggregate counters and histograms — they do not replace per-request metrics, they complement them.
The runtime server (cycles-server v0.1.25.10+) publishes seven domain counters under the cycles_* namespace:
- `cycles_reservations_reserve_total{tenant, decision, reason, overage_policy}`
- `cycles_reservations_commit_total{tenant, decision, reason, overage_policy}`
- `cycles_reservations_release_total{tenant, actor_type, decision, reason}`
- `cycles_reservations_extend_total{tenant, decision, reason}`
- `cycles_reservations_expired_total{tenant}`
- `cycles_events_total{tenant, decision, reason, overage_policy}`
- `cycles_overdraft_incurred_total{tenant}`
The admin server (cycles-server-admin v0.1.25.20+) adds `cycles_admin_audit_writes_total{path_class, outcome}` — alert when the `outcome="error"` series is nonzero to catch silent audit-coverage loss.
The events service (cycles-server-events v0.1.25.6+) publishes eight webhook delivery metrics under cycles_webhook_* — see Server Configuration Reference → Events service metrics for the full inventory.
The tenant label on all three services is gated by cycles.metrics.tenant-tag.enabled (default true) — set to false in deployments with many thousands of tenants to bound Prometheus cardinality.
Next steps
To explore the Cycles stack:
- Read the Cycles Protocol
- Run the Cycles Server
- Manage budgets with Cycles Admin
- Integrate with Python using the Python Client
- Integrate with TypeScript using the TypeScript Client
- Integrate with Spring AI using the Spring Client