SL#66 - AWS AI Series (8/30) - From Laptop to Prod: Deploy Your RepoScout Agent to Bedrock AgentCore Runtime

This is episode 8 of the AWS AI Series, and the one where the agent leaves your laptop. In episode 7 we built RepoScout, a Strands agent with three custom tools that answers questions about a code directory. It ran on your machine, with your personal credentials, in your shell. That is the right place for a tool you use. It is the wrong place for a tool your team uses, a tool that has to be reachable as an endpoint, scale to many concurrent sessions, and run with its own identity instead of your ~/.aws/credentials.

AgentCore Runtime is the AWS service that closes that gap. It is a serverless runtime purpose-built for agents: you hand it your Python entrypoint, it packages your code, boots it in a dedicated microVM per session, and exposes it through the InvokeAgentRuntime API. The promise from the end of episode 7 was that the agent code barely changes. That is true, and this episode shows exactly how true. The interesting part is not the deployment command. It is the one thing about your agent that quietly stops working the moment it stops running on your disk, and what you do about it.

What we are building

The deliverable is RepoScout running in AgentCore Runtime instead of on your laptop, invocable two ways: from the agentcore CLI and from a short boto3 script that any service in your account could run. Same three tools, same model-driven loop, same answers. What changes is where the code executes and how you reach it.

Under the hood, you wrap the agent in the AgentCore harness (BedrockAgentCoreApp), which turns your function into an HTTP service with a /invocations endpoint and a /ping health check. The starter toolkit CLI then builds an ARM64 package, creates an execution role and an S3 artifact, and registers an agent runtime. When you invoke it with a new session ID, AgentCore boots a fresh microVM, runs your code, streams the result back, and tears the microVM down when the session ends. Each session is fully isolated from every other session, which is the security property you cannot easily get by hand.

The non-obvious design choice this episode forces: RepoScout reads files from disk, and in a microVM there is no "your repo" on disk. The data the agent needs has to travel with the agent. We will make that explicit rather than letting it fail in production.

Prerequisites

You need everything from episode 7 working first: Python 3.10 or newer, an AWS account with Amazon Bedrock model access granted for Anthropic Claude Sonnet 4, and AWS credentials configured locally. If you skipped episode 7, the five-minute version is at the end of Setup.

New for this episode, your AWS credentials need more than Bedrock invoke permissions. Deploying with the starter toolkit creates IAM roles, an S3 bucket, a CodeBuild project, and an AgentCore runtime, so the identity you run the CLI with needs permission to create those. The cleanest path for a personal account is to run as an admin or a role with broad create permissions; for a locked-down account, AWS documents the exact caller policy under Use the starter toolkit. You do not need Docker. The default deployment mode builds in AWS CodeBuild, not locally.

One region note up front. The starter toolkit defaults to us-west-2, and this whole episode uses us-west-2. Make sure Claude Sonnet 4 model access is enabled in us-west-2 specifically, not just in the region you used for episode 7. Model access is per region, and a mismatch here is the single most common reason the deployed agent returns an access error.

Setup

Start a fresh project folder. We are not editing the episode-7 folder; we are building a deployable version of it.

mkdir reposcout-runtime && cd reposcout-runtime
python -m venv .venv
source .venv/bin/activate   # Windows: .venv\Scripts\activate
pip install --upgrade pip
pip install "bedrock-agentcore==1.14.0" "strands-agents==1.42.0" "bedrock-agentcore-starter-toolkit==0.3.9"

Three packages, three jobs. strands-agents is the agent framework from episode 7. bedrock-agentcore is the runtime SDK that provides the BedrockAgentCoreApp harness, the HTTP wrapper your code runs inside once deployed. bedrock-agentcore-starter-toolkit is the agentcore CLI that does the configure, deploy, invoke, and teardown. Pin the versions so this tutorial still reproduces in three months; the AgentCore stack ships changes monthly.

Verify the CLI is on your path:

agentcore --help

You should see a usage banner listing configure, deploy, invoke, status, and destroy. If agentcore is not found, your virtualenv is not active or the toolkit did not install. Fix that before going further.

If you did not do episode 7, here is the agent we are deploying, condensed. RepoScout has three tools: list_files lists a directory, read_file reads one file truncated to 8000 characters, and search_text greps the tree with a 40-hit cap. All three are sandboxed to a single root directory and degrade to an error string instead of raising. The agent is one Strands Agent instance with those three tools and a system prompt that tells it to cite the files and line numbers it used. The full source is in the episode-7 post. You will paste those three tool functions into the file we build in Step 1.

Step 1: Wrap RepoScout in the AgentCore harness

The harness is the adapter between your agent and the runtime. Create repo_agent.py and start with the imports and the bundled-code path, which is the part that matters most:

from pathlib import Path
from strands import Agent, tool
from strands.models import BedrockModel
from bedrock_agentcore import BedrockAgentCoreApp

# The agent scouts the code that ships inside the deployment, not your laptop.
ROOT = (Path(__file__).parent / "repo").resolve()

That ROOT line is the whole lesson of this episode in one statement. In episode 7, ROOT was Path("."), the directory you ran the script from, which on your laptop was a real repo you cared about. Inside a microVM there is no such directory. The only files that exist are the ones that got packaged and shipped. So we point ROOT at a repo/ subfolder that we will deploy alongside the agent. The agent scouts the code that traveled with it. If you forget this and leave ROOT = Path("."), the deployed agent will run, take your question, call list_files, and report that the project is empty or contains only its own runtime files. It will not error. It will just be confidently useless, which is worse.
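One way to turn that silent failure into a loud one is a startup check on the bundled folder. The sketch below is an optional addition, not part of the episode's code, and check_bundle is a hypothetical helper name. It assumes you would rather have the process refuse to start than answer questions about an empty directory.

```python
from pathlib import Path


def check_bundle(root: Path) -> int:
    """Return how many files sit under root; raise if the bundle is missing or empty."""
    if not root.is_dir():
        raise RuntimeError(f"Bundled folder is missing: {root}")
    count = sum(1 for p in root.rglob("*") if p.is_file())
    if count == 0:
        raise RuntimeError(f"Bundled folder is empty: {root}")
    return count
```

Calling check_bundle(ROOT) right after ROOT is defined would surface a packaging mistake in the logs at boot rather than in a confidently wrong answer.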

Now paste the three tools from episode 7 (list_files, read_file, search_text) directly below the ROOT line, unchanged. They already reference ROOT, so they will resolve against the bundled repo/ folder with no edits. Then assemble the agent and wrap it in the harness:

model = BedrockModel(
    model_id="us.anthropic.claude-sonnet-4-20250514-v1:0",
    region_name="us-west-2",
    temperature=0.2,
)

SYSTEM = (
    "You are RepoScout, a precise code assistant. Answer questions about "
    "the project by using your tools to inspect real files. Always cite the "
    "file paths and line numbers you used. If you are unsure, say so."
)

agent = Agent(model=model, system_prompt=SYSTEM,
              tools=[list_files, read_file, search_text])

app = BedrockAgentCoreApp()

@app.entrypoint
def invoke(payload):
    question = payload.get("prompt", "")
    result = agent(question)
    return {"result": str(result)}

if __name__ == "__main__":
    app.run()

The agent definition is identical to episode 7 except the region is us-west-2 to match where we deploy. The new part is the harness at the bottom: six lines that create the app and register the handler, plus a two-line main guard. BedrockAgentCoreApp() creates the HTTP service. The @app.entrypoint decorator registers invoke as the handler behind the /invocations route. AgentCore hands your function a payload dict parsed from the request body, you pull prompt out of it, run the agent, and return a JSON-serializable dict. str(result) gives the agent's final text answer, which is what we want to send back. app.run() starts the server when you execute the file directly, on port 8080, the port AgentCore expects.

That is the entire contract. No web framework, no route definitions, no request parsing. The harness is doing the HTTP work so your agent code stays agent code.
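The handler above trusts the payload: a missing prompt becomes an empty string and goes straight to the model. If you would rather reject bad input before spending tokens, a validating version might look like the sketch below. The handle function and the shape of the error dict are assumptions for illustration; the harness does not require either. The agent is passed in as a plain callable so the logic can be exercised without Strands or AWS.

```python
from typing import Any, Callable


def handle(payload: Any, run_agent: Callable[[str], Any]) -> dict:
    """Validate the request body, then run the agent on the prompt."""
    if not isinstance(payload, dict):
        return {"error": "payload must be a JSON object"}
    question = payload.get("prompt")
    if not isinstance(question, str) or not question.strip():
        return {"error": "payload needs a non-empty 'prompt' string"}
    return {"result": str(run_agent(question.strip()))}


# Wiring it into the harness from Step 1 would look roughly like:
#
# @app.entrypoint
# def invoke(payload):
#     return handle(payload, agent)
```

Returning an error dict keeps the response JSON-serializable, so callers get a readable message instead of a stack trace.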

Step 2: Bundle the code the agent scouts

The agent needs something to scout. Create the repo/ folder and drop a small codebase into it. The natural choice is the episode-7 agent itself, so RepoScout can answer questions about RepoScout:

mkdir repo
# Put any code you want the agent to answer questions about here.
# The episode-7 reposcout.py is a good, self-referential choice:
cp ../reposcout/reposcout.py repo/   # adjust the path to wherever yours is

Anything in repo/ will be packaged and deployed with the agent. This is the deliberate, visible answer to the filesystem problem from Step 1. The agent does not reach back to your machine at invocation time; it reads files that are already inside its microVM because you shipped them.

This also surfaces a real architectural decision you will hit on every agent you deploy. Static data the agent needs at runtime, like a small reference corpus or a config, can ride along in the package. Data that is large, private, or changes often should not be baked into the image. For those, the agent should fetch at runtime through a tool: read from S3, clone a repo, or call an API. RepoScout with a bundled repo/ is the simplest end of that spectrum. Episode 10, when we expose tools through AgentCore Gateway, is the other end. For now, bundling keeps the deploy reproducible and the lesson clear.
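To make the fetch-at-runtime end of that spectrum concrete, here is a minimal sketch of a read tool backed by S3 instead of the bundled folder. It is not part of this episode's deploy. The function name read_s3_file is hypothetical, the bucket and key are whatever you choose, and the execution role would need s3:GetObject on that bucket. The client is passed in so the function can be tried with a stand-in object; in the agent you would pass boto3.client("s3").

```python
MAX_CHARS = 8000  # same truncation limit the post describes for read_file


def read_s3_file(client, bucket: str, key: str) -> str:
    """Fetch one object and return its text, or an error string on failure."""
    try:
        obj = client.get_object(Bucket=bucket, Key=key)
        data = obj["Body"].read()
    except Exception as exc:  # kept broad on purpose: a tool should not raise
        return f"ERROR: could not read s3://{bucket}/{key}: {exc}"
    return data.decode("utf-8", errors="replace")[:MAX_CHARS]
```

The trade is explicit: the package stays small and the data stays fresh, but every read becomes a network call that can fail and that needs its own IAM permission.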

Step 3: Test the invocations contract locally

Before paying for a deploy, confirm the harness works on your machine. The harness runs the same HTTP server locally that it runs in the cloud, so a local test exercises the real contract. Start the agent:

python repo_agent.py

It will bind to http://localhost:8080. In a second terminal, hit the /invocations endpoint the same way AgentCore will:

curl -X POST http://localhost:8080/invocations \
  -H "Content-Type: application/json" \
  -d '{"prompt": "What custom tools does this project define?"}'

You should get back a JSON object whose result field names list_files, read_file, and search_text, with file and line citations, because the agent read the bundled repo/reposcout.py to answer. If you see that, three things are proven at once: your harness wiring is correct, your bundled code is reachable, and your Bedrock credentials and model access work. Stop the server with Ctrl+C.

If the response says the project is empty, your repo/ folder is empty or ROOT is pointing at the wrong place. Fix it now, locally, where the feedback loop is seconds instead of minutes. The reason we test the exact /invocations shape rather than calling agent() directly is that this is the contract AgentCore speaks. A local test that skips the HTTP layer can pass while the deployed agent fails on a payload-parsing mismatch.
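If you prefer to run the same check from Python, for example in a smoke test you keep alongside the agent, the curl call translates to the standard library roughly as follows. The helper names build_request and ask are invented here, and the 120-second timeout is a guess at how long a multi-tool answer might take.

```python
import json
import urllib.request

LOCAL_URL = "http://localhost:8080/invocations"


def build_request(prompt: str, url: str = LOCAL_URL) -> urllib.request.Request:
    """Build the same POST that the curl command sends."""
    body = json.dumps({"prompt": prompt}).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def ask(prompt: str, url: str = LOCAL_URL, timeout: float = 120.0) -> dict:
    """Send the prompt to the running agent and return the parsed JSON reply."""
    with urllib.request.urlopen(build_request(prompt, url), timeout=timeout) as resp:
        return json.loads(resp.read().decode("utf-8"))


# With `python repo_agent.py` running in another terminal:
# print(ask("What custom tools does this project define?")["result"])
```

Because it posts the same JSON body to the same route, it exercises the HTTP contract just as the curl command does.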

Step 4: Configure the agent for AWS

Now hand the project to the toolkit. The configure command inspects your entrypoint and writes a deployment config:

agentcore configure -e repo_agent.py --disable-memory -r us-west-2

The -e flag names your entrypoint file. -r us-west-2 pins the region. --disable-memory skips the memory prompt; persistent memory is episode 9, and adding it here would provision resources we do not need yet. The command writes a hidden .bedrock_agentcore.yaml holding the agent name, the entrypoint, the deployment mode, and the execution role it will use.

When prompted for an execution role, let the toolkit create one. This is the identity your agent runs as inside the microVM, and it is the cleanest illustration of why you deploy agents this way. On your laptop, RepoScout ran as you, with whatever your credentials could touch. Deployed, it runs as a dedicated IAM role scoped to exactly what the agent needs, with no human's keys involved. The toolkit provisions a role that can write CloudWatch logs and pull from the deployment bucket. It will also include Bedrock invoke permissions, but keep the cross-region inference detail from episode 7 in mind: if the deployed agent later returns an access error on the model, the fix is to attach the same inference-profile-plus-foundation-model policy from episode 7 to this execution role, because cross-region inference can route the call to a region the default role policy did not anticipate.
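Episode 7's exact policy is not reproduced in this post, so treat the following as a sketch of the usual shape of an inference-profile-plus-foundation-model policy, not a copy of it. The account ID is a placeholder, the Sid is arbitrary, and the wildcard region on the foundation-model ARN is an assumption chosen to cover wherever the profile routes; you may want to narrow it to specific regions.

```python
import json

MODEL_ID = "anthropic.claude-sonnet-4-20250514-v1:0"
PROFILE_ID = "us." + MODEL_ID  # the model_id used in repo_agent.py


def model_invoke_policy(account_id: str, region: str = "us-west-2") -> dict:
    """Build an IAM policy document allowing invocation through the inference profile."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Sid": "InvokeSonnet4ViaInferenceProfile",
                "Effect": "Allow",
                "Action": [
                    "bedrock:InvokeModel",
                    "bedrock:InvokeModelWithResponseStream",
                ],
                "Resource": [
                    # The cross-region inference profile the agent calls.
                    f"arn:aws:bedrock:{region}:{account_id}:inference-profile/{PROFILE_ID}",
                    # The underlying model, in any region the profile routes to.
                    f"arn:aws:bedrock:*::foundation-model/{MODEL_ID}",
                ],
            }
        ],
    }


# 123456789012 is a placeholder account ID.
print(json.dumps(model_invoke_policy("123456789012"), indent=2))
```

Both resources matter: the profile ARN authorizes the call you make, and the foundation-model ARN authorizes the call the profile makes on your behalf.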

Confirm the config landed:

cat .bedrock_agentcore.yaml

You should see your entrypoint, us-west-2, and a deployment mode of direct code deploy, which is the default and needs no Docker.

Step 5: Deploy, then invoke

One command builds and ships everything:

agentcore deploy

This packages your code and your repo/ folder, uploads the artifact to S3, builds an ARM64 (Graviton) package in AWS CodeBuild, creates the execution role if needed, and registers an AgentCore runtime. The first deploy takes a few minutes, most of it CodeBuild. When it finishes, the output prints the agent runtime ARN and the CloudWatch log group. Copy the ARN; you need it for the boto3 path. You can also check readiness any time with:

agentcore status

Once status shows the runtime ready, invoke it from the CLI:

agentcore invoke '{"prompt": "Which functions enforce that paths stay inside the project root?"}'

That JSON string is the same payload shape you curled locally. AgentCore boots a microVM, runs your agent, and returns the answer. The agent should call search_text, find the startswith checks in the bundled code, and cite them with line numbers. Same behavior as your laptop, now running on isolated managed compute.

The other way to invoke is from code, which is how a real service would call it. Create invoke_agent.py:

import json
import uuid
import boto3

agent_arn = "PASTE_YOUR_AGENT_ARN_HERE"
client = boto3.client("bedrock-agentcore", region_name="us-west-2")

response = client.invoke_agent_runtime(
    agentRuntimeArn=agent_arn,
    runtimeSessionId=str(uuid.uuid4()),
    payload=json.dumps({"prompt": "List the project's files."}).encode(),
    qualifier="DEFAULT",
)

# Join the raw chunks before decoding, so a multi-byte character
# split across two chunks cannot break UTF-8 decoding.
body = b"".join(response.get("response", []))
print(json.loads(body.decode("utf-8")))

Run it with python invoke_agent.py. The runtimeSessionId is a fresh UUID per call here, which gives you a clean isolated session each time. Reuse the same session ID across calls and AgentCore keeps the session's microVM warm for continuity, which matters once the agent has memory. The bedrock-agentcore:InvokeAgentRuntime permission is what any calling identity needs, and nothing about your laptop credentials is involved in the agent's own execution. That separation, caller identity versus agent identity, is the production property you came here for.
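If you want continuity rather than a fresh session per call, the caller needs a session ID it can regenerate. A small sketch of one way to derive it follows. The function name session_id_for and the reposcout- prefix are invented for illustration. The 33-character minimum is my understanding of the InvokeAgentRuntime constraint on runtimeSessionId, so verify it against the API reference.

```python
import uuid

MIN_LENGTH = 33  # assumption: minimum runtimeSessionId length per the API reference


def session_id_for(conversation_key: str) -> str:
    """Derive a stable session ID from a caller-chosen key (user ID, ticket, thread)."""
    derived = uuid.uuid5(uuid.NAMESPACE_URL, conversation_key)
    session_id = f"reposcout-{derived}"
    if len(session_id) < MIN_LENGTH:
        raise ValueError("session ID is shorter than the API minimum")
    return session_id
```

Passing runtimeSessionId=session_id_for("some-key") instead of a fresh uuid4 sends repeated calls for that key to the same session, and a different key gets its own isolated one.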

Verify it works

You have a working deployment when three checks pass. First, agentcore status reports the runtime as ready, not creating or failed. Second, agentcore invoke '{"prompt": "What custom tools does this project define?"}' returns an answer that names list_files, read_file, and search_text with citations from repo/reposcout.py, proving the bundled code shipped and the agent can read it inside the microVM. Third, the boto3 script prints the same kind of structured answer, proving the runtime is reachable through the AWS API and not just the CLI.

If you want to watch it work, confirm the runtime is ready, invoke it, then open the logs:

agentcore status            # confirm ready
agentcore invoke '{"prompt": "Which functions enforce path sandboxing?"}'

In the CloudWatch log group printed at deploy time (/aws/bedrock-agentcore/runtimes/...) you will see the microVM boot, the tool calls, and the model invocation for that session, then the session end. Seeing a tool-call line followed by a cited answer is the unambiguous "it works" moment. If the answer is specific and grounded in the bundled files, the tutorial worked.

When it breaks

AccessDeniedException on the model after deploy, but it worked locally. Model access is per region. You almost certainly enabled Claude Sonnet 4 in the region you used for episode 7 and not in us-west-2. Enable it in the Bedrock console for us-west-2, or attach the episode-7 inference-profile policy to the execution role if cross-region inference is routing the call elsewhere.

The deployed agent says the project is empty or only sees runtime files. Your repo/ folder did not get bundled, or ROOT still points at Path("."). Confirm repo/ has files locally, confirm ROOT is Path(__file__).parent / "repo", and redeploy. This is the episode's headline failure mode for a reason.

CodeBuild fails during deploy. The caller identity is missing CodeBuild or S3 permissions, or your account is not set up for the build. Read the CodeBuild project logs in the console, and verify your caller policy includes the toolkit's required actions. This is a permissions problem, not a code problem.

agentcore command not found. Your virtualenv is not active. Run source .venv/bin/activate and retry. The CLI lives in the venv, not globally.

Port 8080 already in use during the local test. Another process holds the port. On Mac or Linux, lsof -i:8080 shows which process it is. Stop it with lsof -ti:8080 | xargs kill (add -9 only if it refuses to exit), then restart the agent.

Estimated AWS cost

AgentCore Runtime bills consumption, not provisioned capacity: $0.0895 per vCPU-hour and $0.00945 per GB-hour, metered per second, with a one-second minimum and a 128 MB memory minimum. The trick that makes this cheap for agents is that CPU is only billed during active processing. RepoScout spends most of each session waiting on the model, and that I/O wait incurs no CPU charge (memory is still metered while the session is alive). A typical RepoScout invocation runs maybe ten seconds of active CPU on one vCPU and peaks well under a gigabyte of memory, so the runtime compute for a single question is a fraction of a cent. Running every example in this tutorial a dozen times stays in the low pennies of compute.

The Bedrock token cost is the same as episode 7 and dominates: roughly $0.03 to $0.06 per question at Claude Sonnet 4's on-demand rates, because each file the agent reads is fed back into the next model call. Direct code deploy stores your artifact in S3 at standard rates (a few-megabyte package is a fraction of a cent per month, billed from late February 2026). The first deploy runs a short CodeBuild job, which for a small ARM build is minutes and often inside the free tier. Check the AgentCore pricing page and the Bedrock pricing page for current rates. The whole episode costs well under a dollar if you clean up after.
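To sanity-check those numbers, here is the arithmetic as a small sketch. The compute rates are the ones quoted above. The session shape (ten seconds of active CPU, half a gigabyte of peak memory, a thirty-second session) and the token counts are illustrative assumptions, not measurements. The token prices are Claude Sonnet 4's on-demand list rates as I understand them, so confirm them on the pricing page.

```python
VCPU_HOUR = 0.0895      # USD per vCPU-hour, from the post
GB_HOUR = 0.00945       # USD per GB-hour, from the post
INPUT_PER_M = 3.00      # USD per million input tokens (assumed list rate)
OUTPUT_PER_M = 15.00    # USD per million output tokens (assumed list rate)


def runtime_cost(active_cpu_s: float, vcpus: float,
                 memory_gb: float, session_s: float) -> float:
    """CPU is charged for active seconds; memory for the whole session."""
    cpu = VCPU_HOUR * vcpus * active_cpu_s / 3600
    memory = GB_HOUR * memory_gb * session_s / 3600
    return cpu + memory


def token_cost(input_tokens: int, output_tokens: int) -> float:
    """On-demand model cost for one question."""
    return (input_tokens * INPUT_PER_M + output_tokens * OUTPUT_PER_M) / 1_000_000


compute = runtime_cost(active_cpu_s=10, vcpus=1, memory_gb=0.5, session_s=30)
tokens = token_cost(input_tokens=10_000, output_tokens=600)
print(f"runtime compute: ${compute:.6f}")  # about $0.000288
print(f"model tokens:    ${tokens:.4f}")   # $0.0390
```

Under these assumptions the model call costs over a hundred times more than the runtime, so trimming what the agent reads saves far more than tuning the compute.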

Cleanup

The runtime does not bill while idle, but the artifacts persist, so tear it down:

agentcore destroy

That deletes the AgentCore runtime. Then confirm nothing is left billing or lingering: check S3 for the deployment bucket artifact and delete it if you do not want it, check CloudWatch Logs for the /aws/bedrock-agentcore/runtimes/... log group, and check IAM for the BedrockAgentCore execution role the toolkit created. Direct code deploy creates no ECR repository, so there is no container image to remove. If you enabled anything from earlier episodes that bills hourly, like a provisioned throughput or an evaluation job, confirm those are gone too; nothing in this episode creates them.

Where to take it next

Three extensions, easiest to hardest. First, give the boto3 caller a stable runtimeSessionId and invoke twice in a row, then watch the CloudWatch logs to see AgentCore keep the same microVM warm across calls; this is the mechanic episode 9 builds on for memory. Second, replace the bundled repo/ folder with a tool that reads files from an S3 bucket instead, so RepoScout can scout a codebase you update without redeploying; this is the realistic version of the filesystem decision from Step 2. Third, swap the model to us.amazon.nova-pro-v1:0 and redeploy, then compare answer quality, latency, and the runtime bill on identical questions, since the change is one line and the deploy is one command.

The arc of this series is that each episode hands the next one a running thing instead of a diagram. You now have an agent in a managed runtime with its own identity, reachable by API, isolated per session. Episode 9 makes it remember you across those sessions. The agent you deployed today is the agent that gets the memory; you will not rebuild it. That is the payoff of the harness: the deploy boundary is a six-line wrapper, not a rewrite.