SL#67 - AWS AI Series (9/30) - Agents That Remember: Persist User Preferences Across Sessions with AgentCore Memory

What we are building

In episodes 7 and 8 we built RepoScout, a Strands agent with custom tools, and shipped it to AgentCore Runtime. It worked, but it had amnesia. Every invocation started from nothing. Tell it on Monday that you write Python and care about type hints, and on Tuesday it has no idea who you are.

Today we fix that. We are going to build a small developer assistant that remembers your preferences across sessions. You tell it "I work in Python and I hate unpinned dependencies" in one run of the script, kill the process, start a fresh run with a brand new session id, and the agent still answers as if it knows you. The whole thing is roughly 60 lines of Python.

The non-obvious design choice is that we are not going to write memory plumbing ourselves. AgentCore Memory does two jobs. Short-term memory stores raw conversation turns, the literal back-and-forth, scoped to a session. Long-term memory runs a background extraction process over those turns and distills durable facts, preferences, and summaries into searchable records that survive across sessions. The AgentCoreMemorySessionManager from the AgentCore SDK wires both into a Strands agent for you: it writes every turn to short-term memory automatically, and before each model call it semantically searches long-term memory and injects the relevant records into the agent's context. You write an agent. It remembers. That is the whole pitch, and the rest of this post is making it real.

Prerequisites

You need an AWS account with credentials configured (aws configure), Python 3.10 or newer, and model access enabled for at least one Bedrock model in your region. We will use us-east-1 throughout because that is where the examples in the official docs run, but any region with both Bedrock and AgentCore Memory works. Set your region once with export AWS_REGION=us-east-1 so boto3 and the SDK agree.

You should be comfortable reading Python and have invoked a Bedrock model at least once (episode 1 covered the Converse API if you need a refresher). Having finished episode 7 helps because the Strands Agent object is the same, but it is not required. If you skipped it, the five-minute version is: Strands is AWS's open-source agent framework where you instantiate Agent(...) and call it like a function. That is all you need to follow along.

One thing that is easy to miss in prerequisites: long-term memory extraction is asynchronous and not instant. After you write a conversation turn, the extraction pipeline takes time, often a couple of minutes, before a durable preference record shows up in long-term memory. Budget for that wait when you test. It is not a bug, it is the design, and we will come back to it in the troubleshooting section.

Setup

Make a clean project and install the SDK with the Strands integration extra. The [strands-agents] extra pulls in everything needed to bind AgentCore Memory to a Strands agent.

mkdir agentcore-memory-agent && cd agentcore-memory-agent
python -m venv .venv
source .venv/bin/activate
pip install 'bedrock-agentcore[strands-agents]'
pip install bedrock-agentcore-starter-toolkit
export AWS_REGION=us-east-1

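Before writing any agent code, confirm your credentials work. This one-liner calls the Bedrock control plane and lists the first three foundation models offered in the region. If it prints model ids, your credentials and region are good. If it fails with AccessDeniedException or UnrecognizedClientException, fix your credentials before going further. Note that seeing a model in this list does not prove you can invoke it; model access is granted separately, so check it in the Bedrock console if your first agent call is refused.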

aws bedrock list-foundation-models --region us-east-1 \
  --query 'modelSummaries[0:3].modelId' --output text

You also need IAM permissions for AgentCore Memory. For local development, a policy granting the AgentCore Memory actions plus model invocation is enough. Attach something like this to the identity you are running as, and tighten it for production later (the AgentCore IAM docs list every action if you want least privilege):

{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": "bedrock-agentcore:*",
      "Resource": "arn:aws:bedrock-agentcore:us-east-1:*:memory/*"
    },
    {
      "Effect": "Allow",
      "Action": ["bedrock:InvokeModel", "bedrock:InvokeModelWithResponseStream"],
      "Resource": "*"
    }
  ]
}

The bedrock-agentcore service prefix covers both the control plane (creating and deleting memory resources) and the data plane (writing events, retrieving records). The bedrock:InvokeModel permissions are for the agent's own model calls. With the built-in strategies used in this post, the extraction models that turn raw turns into long-term records run inside the managed service, so you do not invoke them yourself.

Step 1: Create a memory resource with strategies

A memory resource is the top-level container. On its own it stores short-term events. To get long-term memory, you attach one or more strategies at creation time. A strategy is a managed pipeline that decides what to extract from raw turns and where to file it. AgentCore ships three built-in strategies: userPreferenceMemoryStrategy learns preferences, semanticMemoryStrategy extracts factual statements, and summaryMemoryStrategy keeps a running summary per session.

Memory creation is a one-time setup step, not something your agent does on every run. Do it once in a setup script and reuse the id. Run this:

from bedrock_agentcore.memory import MemoryClient

client = MemoryClient(region_name="us-east-1")

# One-time setup: blocks until the memory resource is ACTIVE.
memory = client.create_memory_and_wait(
    name="DevAssistantMemory",
    description="Remembers developer preferences and facts across sessions",
    strategies=[
        # Learns user preferences, filed per actor.
        {"userPreferenceMemoryStrategy": {
            "name": "PreferenceLearner",
            "namespaces": ["/preferences/{actorId}/"]}},
        # Extracts factual statements, filed per actor.
        {"semanticMemoryStrategy": {
            "name": "FactExtractor",
            "namespaces": ["/facts/{actorId}/"]}},
    ],
)
print("MEMORY_ID:", memory.get("id"))

The create_memory_and_wait call blocks until the resource is ACTIVE, which takes two to three minutes because AgentCore provisions the extraction infrastructure. Copy the printed id and export it: export AGENTCORE_MEMORY_ID=<the-id>. You will reuse it for every agent run from here on.

Look closely at the namespaces values. A namespace is a path that scopes where extracted records get filed, and it must start and end with a slash. The {actorId} placeholder is substituted at runtime with whatever actor id you set, so /preferences/{actorId}/ becomes /preferences/alice/ for user alice and /preferences/bob/ for user bob. That substitution is how one memory resource cleanly separates many users without you writing any filtering logic. AgentCore also supports {sessionId} and {memoryStrategyId} placeholders for finer scoping.
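To make the substitution concrete, here is a minimal sketch of how a namespace template resolves to a per-user path. The resolve_namespace helper is hypothetical and exists only for illustration; the real substitution happens inside AgentCore, not in your code.

```python
import re

# Hypothetical helper: mimics the placeholder substitution described above.
# AgentCore performs the real substitution itself.
PLACEHOLDER = re.compile(r"\{(actorId|sessionId|memoryStrategyId)\}")


def resolve_namespace(template: str, **values: str) -> str:
    """Replace {actorId}, {sessionId}, {memoryStrategyId} with concrete values."""
    def substitute(match: re.Match) -> str:
        key = match.group(1)
        if key not in values:
            raise KeyError(f"no value supplied for placeholder {{{key}}}")
        return values[key]

    return PLACEHOLDER.sub(substitute, template)


print(resolve_namespace("/preferences/{actorId}/", actorId="alice"))  # /preferences/alice/
print(resolve_namespace("/preferences/{actorId}/", actorId="bob"))    # /preferences/bob/
```

One template, two users, two disjoint paths. That is all the per-user isolation amounts to.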

Step 2: Understand the actor, session, and namespace model

Before wiring the agent, get the three identifiers straight, because mixing them up is the single most common way this goes wrong. An actor is the user, stable across time. An actor named alice is the same Alice on Monday and Tuesday. A session is one continuous conversation, deliberately short-lived. Every time you restart the script you create a new session id. An event is a single turn written into a session.

The whole point of long-term memory is that it is scoped to the actor, not the session. Short-term events live under a session and disappear from the agent's working context when the session ends. Long-term records live under an actor namespace like /preferences/alice/ and persist until you delete them or the memory resource, which is exactly why a new session can still recall them.

So the rule is: keep the actor id stable for a given user, and let the session id rotate freely. In our demo the actor id is hardcoded to dev-user-001 and the session id is generated fresh on every run. That mismatch is the feature. The agent forgets the conversation but remembers the person.
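If the scoping still feels abstract, a toy model may help. The sketch below uses plain dictionaries, not the AgentCore API, and the extract_preferences function is a crude stand-in for the managed extraction pipeline. It only shows which key each kind of memory hangs off.

```python
import uuid
from collections import defaultdict

# Toy in-memory model, NOT the AgentCore API: it only illustrates scoping.
short_term = defaultdict(list)  # keyed by (actor_id, session_id)
long_term = defaultdict(list)   # keyed by actor namespace


def new_session_id() -> str:
    """A fresh id per conversation, same shape as in this tutorial."""
    return f"session-{uuid.uuid4().hex[:8]}"


def record_turn(actor_id: str, session_id: str, text: str) -> None:
    """Short-term memory: raw turns filed under one session."""
    short_term[(actor_id, session_id)].append(text)


def extract_preferences(actor_id: str, session_id: str) -> None:
    """Stand-in for extraction: copy preference-like turns to the actor namespace."""
    namespace = f"/preferences/{actor_id}/"
    for text in short_term[(actor_id, session_id)]:
        if "prefer" in text.lower():
            long_term[namespace].append(text)


actor = "dev-user-001"

monday = new_session_id()
record_turn(actor, monday, "I prefer pytest over unittest.")
extract_preferences(actor, monday)

tuesday = new_session_id()
print(short_term[(actor, tuesday)])          # [] because a new session starts empty
print(long_term[f"/preferences/{actor}/"])   # ['I prefer pytest over unittest.']
```

The Tuesday session has no turns of its own, yet the preference is still reachable because it is filed under the actor, not the session.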

Step 3: Wire the session manager into a Strands agent

Now the payoff. The AgentCoreMemorySessionManager is a Strands session manager: hand it to an Agent and it transparently persists every turn and retrieves relevant long-term records before each model call. You configure it with an AgentCoreMemoryConfig that names the memory id, the session id, the actor id, and a retrieval_config mapping each namespace to how aggressively to search it.

Create assistant.py:

import os, uuid
from strands import Agent
from bedrock_agentcore.memory.integrations.strands.config import (
    AgentCoreMemoryConfig, RetrievalConfig)
from bedrock_agentcore.memory.integrations.strands.session_manager import (
    AgentCoreMemorySessionManager)
MEMORY_ID = os.environ["AGENTCORE_MEMORY_ID"]
ACTOR_ID = "dev-user-001"                      # stable: the person
SESSION_ID = f"session-{uuid.uuid4().hex[:8]}"  # fresh every run
config = AgentCoreMemoryConfig(
    memory_id=MEMORY_ID,
    actor_id=ACTOR_ID,
    session_id=SESSION_ID,
    retrieval_config={
        "/preferences/{actorId}/": RetrievalConfig(top_k=5, relevance_score=0.5),
        "/facts/{actorId}/": RetrievalConfig(top_k=5, relevance_score=0.3),
    },
)
with AgentCoreMemorySessionManager(config, region_name="us-east-1") as sm:
    agent = Agent(
        system_prompt=(
            "You are a developer assistant. Use everything you know about "
            "the user's stack and preferences to give specific advice."),
        session_manager=sm,
    )
    print(f"[session {SESSION_ID}]")
    agent("I work in Python and I refuse to ship unpinned dependencies.")
    agent("I also prefer pytest over unittest, always.")

The retrieval_config is where you tune the recall behavior. top_k caps how many records come back per namespace, and relevance_score is the floor on semantic similarity, from 0.0 to 1.0. A lower threshold like 0.3 on facts casts a wide net; a stricter 0.5 on preferences keeps only close matches. The defaults are top_k=10 and relevance_score=0.2 if you omit them, which is usually too loose for production.
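To build intuition for how the two knobs interact, here is a small sketch of threshold-then-cap selection. The ScoredRecord shape, the scores, and the inclusive comparison are all assumptions made for illustration; the real ranking happens inside the service.

```python
from dataclasses import dataclass


@dataclass
class ScoredRecord:  # hypothetical shape, for illustration only
    text: str
    score: float     # semantic similarity in [0.0, 1.0]


def select_records(candidates, top_k, relevance_score):
    """Keep records at or above the floor, best first, capped at top_k."""
    kept = [r for r in candidates if r.score >= relevance_score]
    kept.sort(key=lambda r: r.score, reverse=True)
    return kept[:top_k]


candidates = [
    ScoredRecord("Prefers pytest over unittest", 0.82),
    ScoredRecord("Works in Python", 0.61),
    ScoredRecord("Mentioned a Monday deadline", 0.34),
    ScoredRecord("Said hello", 0.12),
]

strict = select_records(candidates, top_k=5, relevance_score=0.5)
loose = select_records(candidates, top_k=5, relevance_score=0.3)
print([r.text for r in strict])  # the two strongest matches
print([r.text for r in loose])   # also lets the 0.34 record through
```

Raising relevance_score trims noise before top_k ever applies, so a high top_k with a strict floor is often safer than a low top_k with a loose one.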

Notice what is not in this code: no explicit "save this turn" call, no "search memory" call, no context assembly. The session manager does all of it. Wrapping it in a with block is a good habit because the manager flushes any buffered messages on exit. That matters most when you set batch_size greater than 1, where turns are held locally until a batch fills. If you cannot use a context manager, call sm.close() in a finally block or the last turns may never get written.
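To see why the flush on exit matters, here is a toy buffered writer. ToyBufferedManager is a stand-in written for this post, not the SDK class, and its batching logic is an assumption about the general pattern, not a copy of the real implementation.

```python
class ToyBufferedManager:
    """Stand-in for a session manager that batches writes. Not the SDK class."""

    def __init__(self, batch_size: int):
        self.batch_size = batch_size
        self.buffer = []      # messages waiting to be sent
        self.persisted = []   # messages that reached storage

    def append(self, message: str) -> None:
        self.buffer.append(message)
        if len(self.buffer) >= self.batch_size:
            self.flush()

    def flush(self) -> None:
        self.persisted.extend(self.buffer)
        self.buffer.clear()

    def close(self) -> None:
        self.flush()

    def __enter__(self):
        return self

    def __exit__(self, exc_type, exc, tb):
        self.close()
        return False  # do not swallow exceptions


# Without close(): the third message never leaves the buffer.
leaky = ToyBufferedManager(batch_size=2)
for msg in ["turn 1", "turn 2", "turn 3"]:
    leaky.append(msg)
print(leaky.persisted)  # ['turn 1', 'turn 2']

# With try/finally: close() flushes the remainder even if the loop raises.
safe = ToyBufferedManager(batch_size=2)
try:
    for msg in ["turn 1", "turn 2", "turn 3"]:
        safe.append(msg)
finally:
    safe.close()
print(safe.persisted)   # ['turn 1', 'turn 2', 'turn 3']
```

The with block is just the try/finally version with less typing.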

Step 4: Teach the agent in one session

Run the teaching script once:

python assistant.py

You will see a session id and the agent's two replies. Behind the scenes, those four turns (your two messages and the agent's two responses) were written to short-term memory under this session, and the extraction pipeline has started chewing on them to populate /preferences/dev-user-001/.

This is the moment to be patient. Extraction is asynchronous. If you immediately start a new session, long-term memory may still be empty because the pipeline has not finished. Wait two or three minutes before the recall test in the next step. You can watch the records appear with a quick poll:

import os
from bedrock_agentcore.memory import MemoryClient
client = MemoryClient(region_name="us-east-1")
records = client.retrieve_memories(
    memory_id=os.environ["AGENTCORE_MEMORY_ID"],
    namespace="/preferences/dev-user-001/",
    query="programming language and dependency preferences",
    top_k=5)
for r in records:
    print(r)

When that prints records mentioning Python, pinned dependencies, and pytest, the extraction has landed and the agent is ready to prove it remembers.
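If you would rather not re-run the poll by hand, a small wait loop does the job. This is a sketch: wait_for_records and its parameters are hypothetical names, and the demo uses a fake fetch function so it runs without touching AWS.

```python
import time


def wait_for_records(fetch, timeout_s=300.0, interval_s=15.0, sleep=time.sleep):
    """Call fetch() until it returns a non-empty list or the timeout elapses.

    fetch is any zero-argument callable; here it could wrap the
    retrieve_memories call from the poll script. Returns [] on timeout.
    """
    deadline = time.monotonic() + timeout_s
    while True:
        records = fetch()
        if records:
            return records
        if time.monotonic() >= deadline:
            return []
        sleep(interval_s)


# Demo with a fake fetch that succeeds on the third attempt (no AWS calls).
attempts = {"n": 0}


def fake_fetch():
    attempts["n"] += 1
    return ["prefers pytest"] if attempts["n"] >= 3 else []


found = wait_for_records(fake_fetch, timeout_s=5, interval_s=0, sleep=lambda s: None)
print(found, "after", attempts["n"], "attempts")
```

In real use you would pass a lambda that calls client.retrieve_memories with the same arguments as the poll script, and keep the default fifteen-second interval so you are not hammering the API while extraction runs.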

Step 5: Prove it remembers in a brand new session

Make recall.py. It is nearly identical to assistant.py, but it generates a different session id and, crucially, never tells the agent anything about Python or pytest. It just asks a question that only an agent with memory could answer well.

import os, uuid
from strands import Agent
from bedrock_agentcore.memory.integrations.strands.config import (
    AgentCoreMemoryConfig, RetrievalConfig)
from bedrock_agentcore.memory.integrations.strands.session_manager import (
    AgentCoreMemorySessionManager)
config = AgentCoreMemoryConfig(
    memory_id=os.environ["AGENTCORE_MEMORY_ID"],
    actor_id="dev-user-001",                       # same person
    session_id=f"session-{uuid.uuid4().hex[:8]}",  # different conversation
    retrieval_config={
        "/preferences/{actorId}/": RetrievalConfig(top_k=5, relevance_score=0.5),
    },
)
with AgentCoreMemorySessionManager(config, region_name="us-east-1") as sm:
    agent = Agent(
        system_prompt=(
            "You are a developer assistant. Use everything you know about "
            "the user's stack and preferences to give specific advice."),
        session_manager=sm,
    )
    agent("Set up a test and dependency scaffold for my new project. "
          "Use what you already know about how I work.")

The actor id is the same, the session id is new, and the prompt deliberately withholds the answer. A memoryless agent would ask which language you use. This one should scaffold a Python project with pinned dependencies and pytest, because the preference records under /preferences/dev-user-001/ got retrieved and injected before the model ever produced a token.

Verify it works

Run python recall.py after the extraction wait. The contract for success is concrete: the response should mention Python, should pin dependencies (a requirements.txt with == versions or a lockfile), and should choose pytest, none of which you stated in this session. If you see all three, memory works end to end.

For a tighter check, compare against a control. Temporarily change actor_id to dev-user-999, a user the system has never seen, and run again. With no records under /preferences/dev-user-999/, the agent has nothing to retrieve and should either ask what language you use or pick a generic default. The difference between the two runs, same code and same prompt but different actor, is the memory doing its job. That A/B is the cleanest proof you can give yourself that recall is real and not the base model guessing.
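You can turn the three-part success contract into a quick script check. The sketch below is a crude keyword heuristic, not a rigorous evaluation, and recall_report is a hypothetical helper; feed it the text of the agent's reply from each run. The sample strings are invented for the demo.

```python
import re


def recall_report(response_text: str) -> dict:
    """Report which of the three taught preferences show up in a response."""
    text = response_text.lower()
    return {
        "python": "python" in text,
        # an "==" version pin or a lockfile mention counts as pinned
        "pinned": bool(re.search(r"\b[\w.-]+==\d", text)) or "lock" in text,
        "pytest": "pytest" in text,
    }


# Invented sample replies, standing in for the two runs of recall.py.
remembered = "Python project: requirements.txt pinning pytest==8.0.0, tests run with pytest."
control = "Happy to help. Which language are you using for this project?"

print(recall_report(remembered))  # all three True
print(recall_report(control))     # all three False
```

A keyword match can be fooled, so still read the replies yourself. The script is only there to make the A/B difference hard to miss.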

When it breaks

The agent does not recall anything. Almost always this is the asynchronous extraction wait. Poll retrieve_memories as in Step 4; if it returns nothing, the pipeline has not finished, so wait longer. If it still returns nothing after five minutes, your strategy namespace and your retrieval_config namespace probably do not match. The string you set on the strategy at creation time must be byte-for-byte the same as the key in retrieval_config.

Namespace validation errors. Every namespace must start and end with a slash. /preferences/{actorId} without the trailing slash will be rejected. If you renamed namespaces, remember the change has to happen in both the strategy definition and the retrieval config.
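Both namespace problems above are easy to catch before you ever call AWS. Here is a minimal pre-flight check, assuming the slash rule and the exact-match rule as stated in this post; check_namespaces is a hypothetical helper, not part of the SDK.

```python
def check_namespaces(strategy_namespaces, retrieval_keys):
    """Return a list of human-readable problems; an empty list means OK."""
    problems = []
    # dict.fromkeys de-duplicates while keeping order
    for ns in dict.fromkeys([*strategy_namespaces, *retrieval_keys]):
        if not (ns.startswith("/") and ns.endswith("/")):
            problems.append(f"{ns!r} must start and end with a slash")
    for key in retrieval_keys:
        if key not in strategy_namespaces:
            problems.append(f"{key!r} is searched but no strategy writes to it")
    return problems


strategies = ["/preferences/{actorId}/", "/facts/{actorId}/"]
retrieval = ["/preferences/{actorId}", "/facts/{actorId}/"]  # missing trailing slash

for problem in check_namespaces(strategies, retrieval):
    print(problem)
```

Here the single missing slash produces two complaints, which is the point: a namespace typo is both a format error and a silent mismatch.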

It recalls the wrong user's data, or none. You almost certainly let the actor id drift. The actor is the user and must be stable; if you generated it fresh per run the way you do session ids, every run looks like a new person with an empty memory. Hardcode or look up the actor id; only the session id should rotate.

Buffered turns never persist. If you skipped the with block and used batch_size greater than 1, messages sit in a local buffer until flushed. Use the context manager or call close() explicitly, or the last turns of a session are lost.

AccessDeniedException on retrieve or create. Your IAM policy is missing a bedrock-agentcore action. The dev policy in Setup is broad on purpose to get you unblocked; if you tightened it, re-check that retrieval and event-write actions are present.

Cost and cleanup

Following this tutorial costs a few cents. AgentCore Memory is billed per use: about $0.25 per 1,000 short-term events, $0.75 per 1,000 long-term records stored per month, and $0.50 per 1,000 retrievals. Our demo writes a handful of events and runs a handful of retrievals, so memory itself rounds to nothing. With the default built-in strategies the extraction and embedding model costs are included in those memory prices, so you are not billed separately for the LLM that distills your turns into records, only for the agent's own model calls through Bedrock, which for a few short prompts is a fraction of a cent. The one bill that quietly grows is the long-term record storage, charged monthly until you delete the resource, which is why cleanup matters even at this scale.
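For a feel of how those rates scale, here is a back-of-the-envelope estimator. It uses only the per-1,000 prices quoted above, which may change, so treat the output as a rough guide. The function name and the traffic numbers are hypothetical.

```python
# Rates quoted above, in USD per 1,000 units. Verify against current AWS pricing.
RATE_PER_1K = {"events": 0.25, "records_per_month": 0.75, "retrievals": 0.50}


def estimate_memory_cost(events, stored_records, retrievals, months=1):
    """Rough AgentCore Memory estimate; excludes the agent's own model calls."""
    return (
        events / 1000 * RATE_PER_1K["events"]
        + stored_records / 1000 * RATE_PER_1K["records_per_month"] * months
        + retrievals / 1000 * RATE_PER_1K["retrievals"]
    )


# Roughly the size of this tutorial: well under one cent.
demo = estimate_memory_cost(events=4, stored_records=5, retrievals=3)
print(f"demo: ${demo:.4f}")

# A hypothetical busier month.
busy = estimate_memory_cost(events=100_000, stored_records=20_000, retrievals=50_000)
print(f"busy month: ${busy:.2f}")  # busy month: $65.00
```

Note which term carries the months multiplier: events and retrievals are one-off charges, while stored records are billed again every month they exist.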

Delete the memory resource when you are done. Event retention is set per memory resource with a minimum of seven days, so short-term events will not simply age out on their own in the next hour. Removing the whole resource is the clean way to stop all charges:

import os
from bedrock_agentcore.memory import MemoryClient
client = MemoryClient(region_name="us-east-1")
client.delete_memory(memory_id=os.environ["AGENTCORE_MEMORY_ID"])
print("Deleted memory resource.")

Where to take it next

First, add the summaryMemoryStrategy to the same memory resource so the agent also recalls a running summary of past sessions, not just discrete preferences, and add its namespace to retrieval_config. That gives the agent narrative continuity on top of facts.

Second, swap the hardcoded dev-user-001 for a real authenticated user id, so each person who talks to your agent gets their own preference store automatically through the {actorId} substitution, with zero per-user code.
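One way to do that swap is to derive the actor id from whatever stable identifier your auth layer already provides. This is a sketch under assumptions: user_subject stands for something like the sub claim from your identity provider, the user- prefix and the 16-character truncation are arbitrary choices, and you should check the actor id format rules in the AgentCore docs before adopting it.

```python
import hashlib
import uuid


def actor_id_for(user_subject: str) -> str:
    """Derive a stable, opaque actor id from an authenticated user identifier.

    The same input always yields the same id, so memory follows the person.
    """
    digest = hashlib.sha256(user_subject.encode("utf-8")).hexdigest()
    return f"user-{digest[:16]}"


def new_session_id() -> str:
    """Sessions stay disposable: a fresh id per conversation."""
    return f"session-{uuid.uuid4().hex[:8]}"


alice_monday = (actor_id_for("alice@example.com"), new_session_id())
alice_tuesday = (actor_id_for("alice@example.com"), new_session_id())

print(alice_monday[0] == alice_tuesday[0])  # True: same actor
print(alice_monday[1] == alice_tuesday[1])  # False: different sessions
```

Hashing keeps raw emails out of your namespaces while preserving the one property that matters, which is that the actor id never drifts for a given user.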

Third, lift this into AgentCore Runtime from episode 8. The session manager works identically inside the deployed harness, so RepoScout can go from stateless to genuinely personalized without changing its tool code at all. The memory is a sidecar, not a rewrite.

The reframing worth keeping: most "memory" features bolted onto agents are just a longer prompt, the whole transcript stuffed back in until the context window groans. AgentCore Memory is the opposite move. It throws the transcript away on purpose and keeps only the distilled, searchable residue of who the user is. That is what lets an agent remember you for months without paying to re-read every word you ever said. Is your agent carrying a transcript, or does it actually know its user?