Agent Memory Writes Are Actions, Too
A support agent reads a ticket, summarizes it, and saves a note to long-term memory: "Customer accounts under domain acme.com prefer collections-tone follow-ups." The note is wrong. The ticket was unusual, the model overgeneralized, and nothing in the run looked anomalous — the model spend was $0.04, no external tool fired, no email was sent.
A week later, a different agent run for a different acme.com user reads that note out of memory before drafting a reply. It writes the email in collections tone. The send tool is gated and the email is blocked at execution — but the conversation summary it persisted to memory now records that the customer "responded poorly to standard outreach." The next run inherits both notes.
No model call exceeded its budget. No tool tier crossed its threshold. The agent that caused the damage spent four cents. The damage was that it wrote.
Memory writes are actions. Most action authority discussions in the corpus — hard limits on side effects, tool risk scoring, delegation chains — frame actions as outbound side effects: an email leaves the system, a deploy runs, a record is mutated in an external service. Memory writes have a different shape. They persist into the next run's input. The blast radius is forward in time, not outward in space.
Memory Is a Tool That Writes the Future
The shift in 2025–2026 is that memory has moved from a research concept to a deployed primitive. mem0 describes itself as a universal memory layer with explicit add / update / delete operations, and ships integrations across many popular agent frameworks in Python and TypeScript. Letta (formerly MemGPT, renamed in 2024) implements a three-tier model — core memory in the context window, recall memory on disk, archival memory queried as a tool — with the agent itself paging blocks in and out. Zep and several adjacent projects offer graph-shaped memory with similar semantics. Anthropic and OpenAI ship platform-native memory features inside Claude and ChatGPT respectively.
Each of these systems is well-designed for what it is: a layer that captures facts, preferences, and conversation summaries so the agent does not start from zero on every run. None of them is, by itself, a governance system. The memory layer's job is to remember. Deciding whether the next memory write should be remembered is a different layer.
The OWASP Top 10 for Agentic Applications, published December 2025, names this gap explicitly as ASI06: Memory & Context Poisoning. The corpus has touched it in passing — see the framework map in State of AI Agent Governance 2026 — but the runtime-authority treatment has been missing.
What a Memory Operation Actually Does
To classify memory writes the way we classify other actions, it helps to be precise about which operations a memory layer exposes. The names below are conceptual — mem0, Letta, Zep, and bespoke RAG-style stores label them differently — but the operation set across these systems is close to:
| Operation | What it does | Persistence | Read amplification |
|---|---|---|---|
| `add` / `upsert` | Insert a new fact, preference, or summary | Until explicitly removed | Every future run that retrieves this scope |
| `update` | Overwrite or refine an existing fact | Until next update or removal | Replaces prior reads with the new version |
| `delete` / `forget` | Remove a fact | Permanent within the store | Future runs no longer see it |
| `pin` / `unpin` | Mark a fact as always-in-context | Long-lived | Loaded on every run, not just retrieval matches |
| `archive` / `recall` | Move between hot context and cold storage | Hot vs. cold | Affects which queries can retrieve it |
The properties that make these operations behave differently from outbound tool calls:
- The blast radius is temporal, not spatial. A bad email reaches the recipient and stops. A bad memory write reaches every future run that retrieves the affected scope.
- The write is read-amplified. One `add` can be retrieved by thousands of downstream queries. If the scope spans tenants, the amplification crosses tenant boundaries too.
- Reversibility is conditional. `delete` removes the fact only if the system knows the fact is wrong. Many poisoned memories look plausible.
- Attribution decays over time. The agent that wrote the bad fact may not be the agent that reads it. Without per-write provenance, "who poisoned the memory" becomes archaeology.
- The action does not show up in cost. A 200-token `add` costs less than a single retry. Cost-shaped budgets see nothing.
These are the properties that turn memory mutations into a structural compromise rather than a transient exploit — the framing OWASP uses for ASI06.
Placing Memory Writes in the Action Tier Model
The tier model from AI Agent Risk Assessment classifies tools 0–4 by blast radius and reversibility. Memory writes do not fit cleanly into any single tier — different operations behave differently, and the scope of the write matters as much as the operation.
A useful first pass:
| Memory operation | Scope | Default tier | Rationale |
|---|---|---|---|
| `add` to per-user scope | One user | 2 (Write-external, contained) | Affects future runs for one principal |
| `add` to per-tenant scope | One tenant | 3 (Mutation) | Affects every agent run for the tenant |
| `add` to shared / global scope | All tenants | 4 (Execution-equivalent) | Cross-tenant contamination if misclassified |
| `update` of a pinned core memory | Always-in-context | 4 (Execution-equivalent) | Loaded on every run regardless of retrieval |
| `delete` of a verified fact | Any | 3 (Mutation) | Irreversible without backup; affects future retrieval |
| `archive` (move out of hot context) | Any | 1 (Write-local) | Reversible via recall; affects retrieval visibility but not stored content |
This is a starting point, not a verdict. As with the tool-tier model, the value is in the exercise — forcing the team to ask, for each memory scope they expose, what the worst-case downstream effect of a single bad write looks like. A team running per-tenant memory namespaces has a different tier map than a team running shared embeddings across their whole customer base.
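The tier map above compresses into a small lookup. A sketch in Python, assuming a fail-closed default for unclassified operations; the `Tier` names mirror the 0–4 tool-tier model and are not any framework's API:

```python
from enum import IntEnum

class Tier(IntEnum):
    READ_ONLY = 0
    WRITE_LOCAL = 1
    WRITE_EXTERNAL = 2
    MUTATION = 3
    EXECUTION_EQUIVALENT = 4

# Default tier per (operation, scope) pair, mirroring the table above.
# A team with a different scope topology overrides these entries.
DEFAULT_TIERS = {
    ("add", "user"): Tier.WRITE_EXTERNAL,
    ("add", "tenant"): Tier.MUTATION,
    ("add", "shared"): Tier.EXECUTION_EQUIVALENT,
    ("update", "pinned"): Tier.EXECUTION_EQUIVALENT,
    ("delete", "any"): Tier.MUTATION,
    ("archive", "any"): Tier.WRITE_LOCAL,
}

def tier_for(operation: str, scope: str) -> Tier:
    """Resolve the tier for a memory write. Unknown scopes fall back to
    the 'any' entry; a completely unknown operation fails closed to the
    most restrictive tier."""
    return DEFAULT_TIERS.get(
        (operation, scope),
        DEFAULT_TIERS.get((operation, "any"), Tier.EXECUTION_EQUIVALENT),
    )
```

The fail-closed fallback is the design choice worth copying: a memory operation the policy has never seen should gate like a deploy, not like a log line.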
The recent April 2026 mem0 algorithm shift — moving from update/delete semantics toward append-only add extraction, per their published benchmarks — does not eliminate the tiering problem; it changes which operation does the damage. An append-only store with no policy on what gets appended still accumulates poison; the operation tier of add matters more in that model, not less.
What Reserve-Commit Looks Like for Memory Writes
The reserve-commit lifecycle covered in How Reserve-Commit Works in Cycles was originally framed around model spend and external side effects. The same primitive applies cleanly to memory operations — what changes is the unit being reserved.
A `memory.add` to a per-tenant scope, modeled as an action authority reservation, looks like:
- The agent proposes a write — fact text, scope path, source provenance.
- The runtime reserves `RISK_POINTS` against the run's action budget, sized by tier (e.g., 25 points for a per-tenant `add`).
- The runtime returns `ALLOW`, `ALLOW_WITH_CAPS`, or `DENY`. Caps for memory writes might include: `max_writes_remaining` for the run, `allowed_scopes` (e.g., per-user only, not per-tenant), or `require_provenance: true`.
- The write executes against the memory layer.
- The agent commits the reservation with the actual operation result, or releases it if the write was rejected by the memory store itself.
The shape is identical to a `send_email` or `deploy` reservation. The substantive change is what the cap means. `tool_allowlist: ["memory.add_user_scope"]` becomes the memory equivalent of a read-only fallback: the agent can still write notes about its own user but cannot mutate shared knowledge. `tool_denylist: ["memory.update_pinned"]` is the equivalent of disabling deploys when risk budget runs low — the operations with the largest blast radius go first.
The progressive narrowing pattern from the action control post applies directly: as the run consumes risk budget, the memory operations available to it shrink, with shared-scope writes lost first and per-user writes lost last.
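The lifecycle and the narrowing behavior can be sketched as a small budget object. All names here (`RunActionBudget`, `reserve`, `commit`, `release`) are illustrative, not the Cycles API, and the per-scope point costs mirror the illustrative schedule later in the post:

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple

@dataclass
class Reservation:
    points: int

@dataclass
class RunActionBudget:
    """Per-run action risk budget. A hold is taken before the write
    executes and converted to spend only after it succeeds."""
    limit: int
    reserved: int = 0
    spent: int = 0

    def reserve(self, points: int) -> Tuple[str, Optional[Reservation]]:
        # Hold points before the write executes.
        if self.spent + self.reserved + points > self.limit:
            return "DENY", None
        self.reserved += points
        return "ALLOW", Reservation(points)

    def commit(self, r: Reservation) -> None:
        # The write succeeded: convert the hold into spend.
        self.reserved -= r.points
        self.spent += r.points

    def release(self, r: Reservation) -> None:
        # The store rejected the write: return the held points.
        self.reserved -= r.points

    def allowed_scopes(self) -> List[str]:
        # Progressive narrowing: shared-scope writes are lost first,
        # per-user writes last (illustrative costs: 10 / 25 / 50 points).
        remaining = self.limit - self.spent - self.reserved
        return [scope for scope, cost in
                [("user", 10), ("tenant", 25), ("shared", 50)]
                if remaining >= cost]

budget = RunActionBudget(limit=100)
decision, res = budget.reserve(25)   # propose a per-tenant add: 25 points
if decision == "ALLOW":
    # ... execute memory.add against the store, then:
    budget.commit(res)
```

As the run spends, `allowed_scopes()` shrinks from `["user", "tenant", "shared"]` down to `["user"]` and then to nothing, which is exactly the narrowing the prose describes.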
A RISK_POINTS Schedule for Memory Operations
The point values below are illustrative. Like the tool schedule in AI Agent Risk Assessment, the relative weighting matters more than the absolute numbers, and every team's schedule should reflect its own scope topology.
| Memory operation | Scope | Risk points | Comparable outbound action |
|---|---|---|---|
| `read` / `retrieve` | Any | 1 | Read-only DB query |
| `add` to per-run scratchpad | Ephemeral | 2 | Internal log write |
| `add` to per-user memory | One user | 10 | File write |
| `update` to per-user memory | One user | 15 | DB record update |
| `add` to per-tenant memory | One tenant | 25 | DB mutation |
| `update` to per-tenant memory | One tenant | 30 | DB mutation, broad reach |
| `add` to shared / global memory | All tenants | 50 | Deploy or config change |
| `update` to pinned core memory | Always-in-context | 50 | Permission grant |
| `delete` of verified fact | Any | 25 | DB delete |
A workflow capped at 100 risk points can do 50 reads and 4 per-user writes, or 4 per-tenant writes and nothing else, or one shared-scope write and a handful of reads. The cap forces prioritization at write time, not after the bad fact has been retrieved a thousand times. For per-tool point assignment in Cycles, see Assigning RISK_POINTS to agent tools.
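Those trade-offs can be checked mechanically. A minimal sketch of the schedule as a lookup table, with values copied from the table above; the fail-closed fallback for unknown operations is an assumption:

```python
# Illustrative schedule from the table above: (operation, scope) -> points.
RISK_POINTS = {
    ("read", "any"): 1,
    ("add", "scratchpad"): 2,
    ("add", "user"): 10,
    ("update", "user"): 15,
    ("add", "tenant"): 25,
    ("update", "tenant"): 30,
    ("add", "shared"): 50,
    ("update", "pinned"): 50,
    ("delete", "any"): 25,
}

def cost(operation: str, scope: str) -> int:
    """Look up the point cost. Unknown scopes fall back to 'any'; a
    completely unknown operation is priced at the maximum (fail closed)."""
    return RISK_POINTS.get(
        (operation, scope),
        RISK_POINTS.get((operation, "any"), max(RISK_POINTS.values())),
    )

BUDGET = 100
# The three trade-offs from the text, each within the 100-point cap:
assert 50 * cost("read", "any") + 4 * cost("add", "user") <= BUDGET  # 90
assert 4 * cost("add", "tenant") <= BUDGET                           # 100
assert cost("add", "shared") + 10 * cost("read", "any") <= BUDGET    # 60
```

The assertions at the bottom are the point: a schedule you can run arithmetic against is a schedule a runtime can enforce.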
Why Existing Controls Do Not Govern Memory Writes
A handful of layers in the typical agent stack touch memory in some way. None of them, in their default configuration, acts as a runtime authority for memory writes.
| Layer | What it does | What it does not do |
|---|---|---|
| Memory layer (mem0, Letta, Zep) | Stores and retrieves facts | Does not decide whether a given write should be persisted under a given run's risk budget |
| MCP server in front of the memory tool | Brokers tool calls | In its default configuration, sees one call at a time without cross-run amplification context |
| RAG retrieval reranker | Filters what gets read | Operates at read time; the bad fact is already in the store |
| Memory hygiene / guard products | Validate hashes, detect tampering | Detect attacks; do not bound authorized but ill-advised writes |
| Observability / tracing | Records what was written | Records, not enforces |
The closest adjacent category is memory-poisoning detection — projects like the OWASP Agent Memory Guard and several commercial products use SHA-256 cryptographic baselines and tamper-detection heuristics to flag attacks. Those defenses cover deliberate poisoning by an attacker. They do not cover a separate failure mode: an authorized agent making an authorized write that turns out to be wrong. Runtime authority is the layer that decides whether the write should happen at all, before the question of whether the write was malicious even arises.
This is the same structural argument made in Agents Are Cross-Cutting. Your Controls Aren't. and MCP Gateways Are Not Runtime Authority: tool-local controls govern themselves, not the agent's cumulative exposure across the dimensions the agent actually spans. Memory is one more such dimension.
Memory Writes Drive Policy Drift
Policy Drift in AI Agents listed memory as one of the surfaces along which approved agent behavior diverges from live behavior. The post named it; this is the mechanism.
An agent approved on day one with a clean memory store behaves differently after 30 days of accumulated facts. Some of those facts are correct refinements. Some are model-generated overgeneralizations. Some are stale — true at the time, no longer true. The agent's effective prompt is the union of its approved system prompt and whatever its retrieval layer pulled from memory. Static policy review, performed once at approval time, cannot bound what that union becomes over time without a runtime check at every write.
Two patterns are useful in combination:
- Per-write provenance — every memory entry carries the run ID, agent identity, and risk-budget context at the time of the write. When a downstream incident traces back to a poisoned memory, attribution exists. The byproduct argument in The AI Agent Audit Trail You're Already Building applies here unchanged.
- TTL on unverified facts — facts written without independent verification expire automatically unless promoted. Expiration-style mitigations are recommended in several recent agent-security write-ups, including the OWASP ASI06 discussion. The effect is that drift has a half-life: a bad fact written today is gone in 30 days unless something corroborates it.
Neither pattern is a memory layer feature in the typical sense. Both are runtime authority concerns — enforced at write time, recorded at the same layer that decides budgets and tool gates.
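What the two patterns look like attached to a single entry can be sketched briefly. The field names and the 30-day TTL are illustrative assumptions, not a specific memory layer's schema:

```python
import time
from dataclasses import dataclass, field
from typing import List, Optional

UNVERIFIED_TTL_SECONDS = 30 * 24 * 3600  # illustrative 30-day half-life

@dataclass
class MemoryEntry:
    """Illustrative entry envelope combining per-write provenance
    with a TTL on unverified facts."""
    text: str
    run_id: str             # per-write provenance: which run wrote this
    agent_id: str           # ... and under which agent identity
    risk_points_spent: int  # risk-budget context at write time
    verified: bool = False
    written_at: float = field(default_factory=time.time)

    def expired(self, now: Optional[float] = None) -> bool:
        if self.verified:
            return False  # corroborated facts do not expire
        now = time.time() if now is None else now
        return now - self.written_at > UNVERIFIED_TTL_SECONDS

    def promote(self) -> None:
        # Corroboration promotes the fact to non-expiring.
        self.verified = True

def retrieve(store: List[MemoryEntry]) -> List[MemoryEntry]:
    # Expiry enforced at read time; a background sweeper can also
    # hard-delete expired rows.
    return [e for e in store if not e.expired()]
```

A fact written today decays out of retrieval in 30 days unless something calls `promote()`, which is the half-life behavior the TTL pattern is after.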
Tenant Isolation Becomes a Memory Problem
The multi-tenant arguments in Multi-Tenant AI Cost Control and Agent Identity Is Not User Identity translate directly to memory scopes. A platform that shares a single embeddings collection across all customers has a cross-tenant amplifier: one customer's poisoned write affects retrieval for every other customer whose vectors land in the same neighborhood.
The fixes are familiar from the budget side of the protocol:
- Memory scopes mirror scope paths — `tenant:acme/agent:support` writes only to that scope, and retrievals only resolve within it.
- Cross-tenant writes are a distinct action class with a higher risk-point cost or a hard `DENY`.
- Audit and attribution are per-scope, not per-store.
A memory layer that does not enforce scopes is the memory equivalent of a budget system that does not enforce per-tenant caps. The damage looks the same: one tenant's bad behavior degrades service for every tenant.
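Enforced at the store boundary rather than the application layer, the scope check is a few lines. The scope-path syntax follows the `tenant:acme/agent:support` example above; the containment rule (exact match or path-prefix) is an assumption about how a team might encode it:

```python
class ScopeViolation(Exception):
    """Raised when a run attempts to write outside its memory scope."""

def check_write_scope(run_scope: str, target_scope: str) -> None:
    """Store-level containment check: a run may write to its own scope
    path or anything nested beneath it, and nothing else."""
    contained = (target_scope == run_scope
                 or target_scope.startswith(run_scope + "/"))
    # The "/" suffix prevents 'tenant:acme' from matching 'tenant:acme2'.
    if not contained:
        raise ScopeViolation(f"{run_scope} may not write to {target_scope}")
```

Running the same check on retrieval paths closes the read side of the same hole.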
A Short Checklist for Memory Governance
For each memory layer in production, ask:
- Are writes tiered? Is a per-user `add` separated from a per-tenant `add` from a shared `add`, with different runtime gates on each?
- Are write quotas enforced per run? Can a single runaway agent flood the store, or is `memory.add` budgeted like any other action?
- Is every write attributed? Run ID, agent identity, prompt, retrieval context — enough to trace a downstream incident back to its source.
- Do unverified facts expire? TTL on writes that lack corroboration shrinks the drift surface.
- Are tenant scopes isolated at the store level, not only at the application level? Application bugs are common; store-level isolation survives them.
- Are pinned / always-loaded memories on a separate, stricter gate? A bad pin is loaded on every run; the blast radius is permanent until cleared.
A memory layer can answer few of these questions on its own. The questions are runtime authority concerns, applied to a write surface that most stacks currently treat as silent state.
What Changes When Memory Writes Are Budgeted
Treating memory writes as actions has a few non-obvious consequences.
Memory becomes a budgeted resource at the workflow level, not an infinite background service. A workflow's effective scope is not just which tools it can call but which memory scopes it can write to, and how many writes it gets per run. The team designing the agent has to choose which writes are worth the budget, which is roughly the same as choosing which long-lived state the agent actually needs.
Memory drift becomes a measurable, bounded property. Without write quotas, drift accumulates at the rate of usage. With them, drift is bounded by the same risk budget that bounds everything else, and detectable when the budget runs out unexpectedly. Estimate-drift logic similar to that described in Estimate Drift: The Silent Killer of Budget Enforcement applies in this dimension too.
Incident response gets faster. When an agent starts behaving oddly and the cause is a poisoned memory, the audit trail of recent writes — bounded in size by the per-run quota — narrows the search to a small set of candidates. Without that quota, "what changed in memory" is an open question with no obvious end.
And the model of what an "agent action" is gets a little closer to the real shape of the system. Outbound side effects matter because they reach external parties. Memory writes matter because they reach the next run. Both are actions. Both deserve to be reserved, capped, and committed before they happen — not retroactively analyzed after the next agent has already read what the previous one wrote.
Next Steps
- AI Agent Action Control: Hard Limits on Side Effects — the parent action-control framing this post extends
- AI Agent Risk Assessment: Score, Classify, Enforce — how to build a risk schedule for your own tool set
- Policy Drift in AI Agents — the broader drift surface memory writes belong to
- Agent Identity Is Not User Identity — why per-write provenance has to name the agent, not the user
- State of AI Agent Governance 2026 — the OWASP ASI06 landing slot for memory & context poisoning
- How Reserve-Commit Works in Cycles — the lifecycle that applies, unchanged, to memory operations
- Assigning RISK_POINTS to agent tools — the per-tool schedule mechanism, now extended to memory ops