SL#71 - AWS AI Series (12/30) - Agents You Can Audit: Lock an AgentCore Agent Behind JWT Auth and Trace Every Call in CloudWatch

We take a Strands agent on Bedrock AgentCore Runtime, put a Cognito JWT wall in front of it so only authenticated callers get in, and wire OpenTelemetry so every invocation shows up as a traced session in the CloudWatch GenAI Observability dashboard. Plan for 90 minutes if you finished episode 8, about 3 hours from scratch.

What we are building

By the end you will have one agent runtime that does two things most demo agents skip: it refuses anonymous callers, and it tells you exactly what it did on every request. The access wall is AgentCore Identity inbound auth, configured as a custom JWT authorizer backed by an Amazon Cognito user pool. A caller without a valid bearer token gets a 401 before your code runs. The audit trail is AgentCore Observability: with the AWS Distro for OpenTelemetry (ADOT) in your image and Transaction Search enabled in CloudWatch, every invocation becomes a session made of traces and spans, each span carrying timing, token usage, tool calls, and errors.

The non-obvious design choice is that you do not write any of the tracing code. AgentCore Runtime auto-instruments the agent at the container level, and Identity issues a workload identity for the runtime automatically the moment you attach a JWT authorizer. Your job is to turn on the two AWS-side switches and then read the dashboards. The finished thing: a deployed agent ARN you can only invoke with a token, plus a CloudWatch view where you can click one session and watch the model call, the tool call, and the latency of each, with the caller's identity attached.

Prerequisites

You need an AWS account with Bedrock model access enabled in your region (us-east-1 here), the AWS CLI v2 configured, Python 3.10+, Docker running locally (the starter toolkit builds a container image), and jq installed for the Cognito script. You should be comfortable reading Python and shell, and you should have invoked an AgentCore agent at least once before. If you did episode 8 you already have a deployed RepoScout agent and the IAM scaffolding; we will add auth and tracing to that shape. If you skipped it, the Setup section below gives you a 5-minute agent to deploy from zero so you are not blocked.

One account-level cost note up front. AgentCore Runtime bills active consumption at roughly $0.0895 per vCPU-hour and $0.00945 per GB-hour as of April 2026, and CloudWatch observability is consumption-based with no built-in cap: you pay for span ingestion, storage, and queries. Following this tutorial end to end costs well under $1, but the observability bill is the one to watch in production because it scales with traffic, not with a flat fee.
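
To make the compute side of that bill concrete, here is a small arithmetic sketch using only the two rates quoted above. The workload numbers (200 invocations, 30 seconds active each, 1 vCPU, 2 GB) are hypothetical, and the estimate deliberately leaves out CloudWatch charges, which depend on span volume.

```python
# Rates quoted above (as of April 2026); treat them as illustrative, not a price sheet.
VCPU_HOUR_USD = 0.0895
GB_HOUR_USD = 0.00945


def runtime_cost(vcpu_hours: float, gb_hours: float) -> float:
    """Estimated AgentCore Runtime compute charge in USD (observability excluded)."""
    return vcpu_hours * VCPU_HOUR_USD + gb_hours * GB_HOUR_USD


# Hypothetical workload: 200 invocations, each active for 30 s on 1 vCPU with 2 GB.
active_hours = 200 * 30 / 3600
estimate = runtime_cost(vcpu_hours=active_hours * 1, gb_hours=active_hours * 2)
print(f"estimated compute cost: ${estimate:.2f}")
```

Under those assumptions the compute charge comes to roughly 18 cents, which is consistent with the "well under $1" figure for this tutorial.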

Setup

Create a clean project and virtual environment, and pin the pieces that matter.

mkdir agentcore-audited && cd agentcore-audited
python3 -m venv .venv && source .venv/bin/activate
pip install bedrock-agentcore-starter-toolkit 'strands-agents[otel]' strands-agents-tools requests

The [otel] extra on strands-agents is what makes the framework emit OpenTelemetry spans. Now write the agent. If you have your episode 8 agent, reuse it; otherwise this is a complete, deployable agent.

# agent.py
from strands import Agent, tool
from strands.models import BedrockModel
from bedrock_agentcore.runtime import BedrockAgentCoreApp

app = BedrockAgentCoreApp()

@tool
def word_count(text: str) -> int:
    """Count the words in a piece of text."""
    return len(text.split())

model = BedrockModel(model_id="us.anthropic.claude-3-7-sonnet-20250219-v1:0")
agent = Agent(model=model, tools=[word_count],
              system_prompt="You are a concise writing assistant.")

@app.entrypoint
def invoke(payload, context):
    result = agent(payload.get("prompt", ""))
    return result.message["content"][0]["text"]

if __name__ == "__main__":
    app.run()

Create a requirements.txt next to it. The aws-opentelemetry-distro line is mandatory: it is the ADOT package AgentCore's auto-instrumentation looks for at runtime.

strands-agents[otel]
strands-agents-tools
aws-opentelemetry-distro
bedrock-agentcore

Smoke test locally before touching AWS: python agent.py should start the app server without error. Stop it with Ctrl-C. If that runs clean, the agent is structurally valid and you are ready to wire the two switches.
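
If you want the smoke test to exercise the entrypoint and not just the server startup, you can post a prompt to the local server from a second terminal. This is a minimal sketch that assumes the app listens on port 8080 and serves POST /invocations, which is my understanding of the AgentCore Runtime container contract; the helper names build_request and send are made up for this example. A real call also needs AWS credentials in your shell, because the agent will call Bedrock.

```python
import json
import urllib.request

# Assumption: local server on port 8080 with a POST /invocations route.
LOCAL_URL = "http://localhost:8080/invocations"


def build_request(prompt: str, url: str = LOCAL_URL) -> urllib.request.Request:
    """Build the same JSON POST body that invoke() reads via payload.get("prompt")."""
    body = json.dumps({"prompt": prompt}).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def send(request: urllib.request.Request, timeout: float = 60.0) -> str:
    """Send the request and return the raw response body as text."""
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.read().decode("utf-8")
```

With python agent.py running in another terminal, print(send(build_request("Count the words in: hello there world"))) should print the agent's reply.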

Step 1: Turn on Transaction Search (one time per account)

Before any trace can show up, CloudWatch needs Transaction Search enabled. This is a one-time per-account setup, and skipping it is the single most common reason people deploy a "traced" agent and see nothing in the dashboard. Do it from the CLI in two calls.

aws logs put-resource-policy --policy-name AgentCoreTxnSearch \
  --policy-document '{"Version":"2012-10-17","Statement":[{"Sid":"TxnSearchXRay","Effect":"Allow","Principal":{"Service":"xray.amazonaws.com"},"Action":"logs:PutLogEvents","Resource":["arn:aws:logs:us-east-1:ACCOUNT_ID:log-group:aws/spans:*","arn:aws:logs:us-east-1:ACCOUNT_ID:log-group:/aws/application-signals/data:*"],"Condition":{"ArnLike":{"aws:SourceArn":"arn:aws:xray:us-east-1:ACCOUNT_ID:*"},"StringEquals":{"aws:SourceAccount":"ACCOUNT_ID"}}}]}'

aws xray update-trace-segment-destination --destination CloudWatchLogs

Replace ACCOUNT_ID with your 12-digit account number. The first call lets X-Ray write span data into the aws/spans log group; the second tells X-Ray to send segments to CloudWatch Logs, which is what the GenAI Observability dashboard reads. After enabling, it can take up to ten minutes for spans to become searchable, so do this early. By default 1% of spans are indexed as trace summaries at no charge, which is plenty for a tutorial. You can raise that indexing percentage later with aws xray update-indexing-rule once you know your traffic.
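
Hand-editing four ACCOUNT_ID placeholders inside a one-line JSON string is easy to get wrong. As an alternative, this sketch builds the same policy document from an account ID and region; the function name transaction_search_policy is hypothetical and 123456789012 is a placeholder account number.

```python
import json


def transaction_search_policy(account_id: str, region: str = "us-east-1") -> str:
    """Return the resource policy from the CLI call above as a compact JSON string."""
    if not (account_id.isdigit() and len(account_id) == 12):
        raise ValueError("account_id must be a 12-digit string")
    log_group = f"arn:aws:logs:{region}:{account_id}:log-group"
    policy = {
        "Version": "2012-10-17",
        "Statement": [{
            "Sid": "TxnSearchXRay",
            "Effect": "Allow",
            "Principal": {"Service": "xray.amazonaws.com"},
            "Action": "logs:PutLogEvents",
            "Resource": [
                f"{log_group}:aws/spans:*",
                f"{log_group}:/aws/application-signals/data:*",
            ],
            "Condition": {
                "ArnLike": {"aws:SourceArn": f"arn:aws:xray:{region}:{account_id}:*"},
                "StringEquals": {"aws:SourceAccount": account_id},
            },
        }],
    }
    return json.dumps(policy, separators=(",", ":"))


# Placeholder account ID; substitute your own before using the output.
print(transaction_search_policy("123456789012"))
```

You can save the printed JSON to a file and pass it to put-resource-policy with --policy-document file://policy.json.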

If you prefer the console, it is under CloudWatch, Settings, the X-Ray traces tab, Transaction Search, View settings, Edit, Enable. Wait until "Ingest OpenTelemetry spans" reads Enabled before you send any traffic.

Step 2: Stand up a Cognito user pool for inbound auth

Inbound auth needs an OpenID Connect issuer that mints JWTs. Cognito is the path of least resistance. This script creates a user pool, an app client with password auth, and a test user, then prints the two values the authorizer needs: the discovery URL and the client ID. The third thing a caller needs, a bearer token, is fetched in Step 4 because it expires.

# setup_cognito.sh
# TEST_USER rather than USERNAME, which zsh treats as a special parameter
export REGION=us-east-1 TEST_USER=testuser TEST_PASSWORD='Tutorial#2026'
# create the user pool and capture its ID
export POOL_ID=$(aws cognito-idp create-user-pool --pool-name AgentCorePool \
  --policies '{"PasswordPolicy":{"MinimumLength":8}}' --region "$REGION" | jq -r '.UserPool.Id')
# create a public app client (no secret) that allows username/password auth
export CLIENT_ID=$(aws cognito-idp create-user-pool-client --user-pool-id "$POOL_ID" \
  --client-name AgentCoreClient --no-generate-secret \
  --explicit-auth-flows ALLOW_USER_PASSWORD_AUTH ALLOW_REFRESH_TOKEN_AUTH \
  --region "$REGION" | jq -r '.UserPoolClient.ClientId')
# create the test user and give it a permanent password
aws cognito-idp admin-create-user --user-pool-id "$POOL_ID" --username "$TEST_USER" \
  --region "$REGION" --message-action SUPPRESS > /dev/null
aws cognito-idp admin-set-user-password --user-pool-id "$POOL_ID" --username "$TEST_USER" \
  --password "$TEST_PASSWORD" --region "$REGION" --permanent > /dev/null
# export the discovery URL as well, so deploy.py can read it from the environment
export DISCOVERY_URL="https://cognito-idp.$REGION.amazonaws.com/$POOL_ID/.well-known/openid-configuration"
echo "Discovery URL: $DISCOVERY_URL"
echo "Client ID: $CLIENT_ID"

Run it with source setup_cognito.sh so the variables stay in your shell. Save the discovery URL and client ID. The discovery URL must end in /.well-known/openid-configuration; the authorizer validates it against that exact pattern. The reason this matters is that the issuer inside that document has to match the iss claim in every token Cognito later signs, and the client ID has to match the token's client_id claim. Get either wrong and you will spend an afternoon staring at 403s.
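
The relationship between the discovery URL and the iss claim can be checked before you deploy. This sketch relies on the Cognito convention that the issuer is the discovery URL with the /.well-known/openid-configuration suffix removed; other identity providers may not follow it. The function names and the pool ID us-east-1_EXAMPLE are made up for illustration.

```python
SUFFIX = "/.well-known/openid-configuration"


def cognito_discovery_url(region: str, pool_id: str) -> str:
    """Build the discovery URL in the same shape the setup script prints."""
    return f"https://cognito-idp.{region}.amazonaws.com/{pool_id}{SUFFIX}"


def expected_issuer(discovery_url: str) -> str:
    """Return the iss value Cognito tokens should carry for this discovery URL."""
    if not discovery_url.endswith(SUFFIX):
        raise ValueError(f"discovery URL must end in {SUFFIX}")
    return discovery_url[: -len(SUFFIX)]


# Hypothetical pool ID, for illustration only.
url = cognito_discovery_url("us-east-1", "us-east-1_EXAMPLE")
print(url)
print(expected_issuer(url))
```

If the issuer printed here differs from the iss claim in a token you decode later, the authorizer will not accept that token.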

Step 3: Deploy the agent with a JWT authorizer attached

Now configure and launch the runtime, but this time pass an authorizer_configuration. That single argument is what flips the runtime from default IAM SigV4 auth to JWT bearer-token auth. A runtime can use one or the other, not both at once.

# deploy.py
from bedrock_agentcore_starter_toolkit import Runtime
import os

runtime = Runtime()
runtime.configure(
    entrypoint="agent.py",
    auto_create_execution_role=True,
    auto_create_ecr=True,
    requirements_file="requirements.txt",
    region="us-east-1",
    agent_name="audited_agent",
    authorizer_configuration={
        "customJWTAuthorizer": {
            "discoveryUrl": os.environ["DISCOVERY_URL"],
            "allowedClients": [os.environ["CLIENT_ID"]],
        }
    },
)
result = runtime.launch()
print(result)

Export the two values from Step 2 and run it: export DISCOVERY_URL=... CLIENT_ID=... && python deploy.py. The toolkit builds the container (this is where Docker is needed), pushes it to a new ECR repo, creates an execution role, and registers the runtime with the JWT authorizer baked in. Note the agent runtime ARN in the output.

Two things happen automatically that are worth naming, because they used to be manual. First, when you create a runtime with a JWT authorizer, AgentCore Identity creates a workload identity for it without any extra call. Second, since October 13, 2025, workload-identity permissions come from a service-linked role named AWSServiceRoleForBedrockAgentCoreRuntimeIdentity instead of inline policies on your execution role. You only need to make sure the principal running deploy.py is allowed to create that service-linked role, which auto_create_execution_role=True handles for new accounts. If you are on an agent created before that date, you still need the older GetWorkloadAccessToken policy on the execution role.

Step 4: Invoke with a bearer token and a session ID

This is where access control becomes real. boto3 cannot invoke a JWT-protected runtime because it always signs with SigV4, so you call the runtime's HTTPS endpoint directly with the bearer token. Fetch a fresh token first, because Cognito access tokens expire after 60 minutes by default.

# invoke.py
import os, json, urllib.parse, requests

token = os.environ["TOKEN"]
arn = os.environ["AGENT_ARN"]
escaped = urllib.parse.quote(arn, safe="")
url = f"https://bedrock-agentcore.us-east-1.amazonaws.com/runtimes/{escaped}/invocations?qualifier=DEFAULT"

resp = requests.post(url,
    headers={
        "Authorization": f"Bearer {token}",
        "Content-Type": "application/json",
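        # padded to 34 characters: the API reference lists a 33-character minimum for session IDs
        "X-Amzn-Bedrock-AgentCore-Runtime-Session-Id": "demo-session-001-" + "0" * 17,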
    },
    data=json.dumps({"prompt": "How many words are in this sentence?"}))
print(resp.status_code, resp.text[:400])

Get a token and call it:

export TOKEN=$(aws cognito-idp initiate-auth --client-id "$CLIENT_ID" \
  --auth-flow USER_PASSWORD_AUTH \
  --auth-parameters USERNAME=testuser,PASSWORD='Tutorial#2026' \
  --region us-east-1 | jq -r '.AuthenticationResult.AccessToken')
export AGENT_ARN='<arn from step 3>'  # paste the runtime ARN printed by deploy.py
python invoke.py

A 200 with the agent's answer means the token passed the authorizer and your code ran. Now prove the wall works: run python invoke.py again with TOKEN set to garbage, or drop the Authorization header entirely. You should get a 401 Unauthorized with a WWW-Authenticate: Bearer header pointing at the protected-resource metadata endpoint. That 401 is the whole point. Your agent code never executed for the anonymous caller. The Session-Id header you sent groups this call and any follow-ups into one session in the dashboard, which is what you look at next.
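
The three cases (valid token, garbage token, no Authorization header) can be run in one pass. This is a sketch using only the standard library; the helper names are hypothetical, and the UUID-based session ID is my own choice to keep the ID unique and comfortably long, since as I understand it the runtime expects session IDs of at least 33 characters.

```python
import json
import urllib.error
import urllib.request
import uuid
from typing import Optional

SESSION_HEADER = "X-Amzn-Bedrock-AgentCore-Runtime-Session-Id"


def new_session_id() -> str:
    """Unique session ID; the prefix plus a UUID is 47 characters long."""
    return f"wall-check-{uuid.uuid4()}"


def build_headers(token: Optional[str], session_id: str) -> dict:
    """Headers for one invocation; token=None omits Authorization entirely."""
    headers = {"Content-Type": "application/json", SESSION_HEADER: session_id}
    if token is not None:
        headers["Authorization"] = f"Bearer {token}"
    return headers


def status_for(url: str, token: Optional[str], session_id: str) -> int:
    """POST a trivial prompt and return the HTTP status, including 4xx codes."""
    request = urllib.request.Request(
        url,
        data=json.dumps({"prompt": "ping"}).encode("utf-8"),
        headers=build_headers(token, session_id),
        method="POST",
    )
    try:
        with urllib.request.urlopen(request, timeout=60) as response:
            return response.status
    except urllib.error.HTTPError as err:
        return err.code  # urllib raises on 4xx/5xx; the code is what we want


def check_wall(url: str, good_token: str) -> dict:
    """Return the status code for each of the three cases described above."""
    session_id = new_session_id()
    return {
        "valid": status_for(url, good_token, session_id),
        "garbage": status_for(url, "not-a-real-token", session_id),
        "missing": status_for(url, None, session_id),
    }
```

Calling check_wall(url, os.environ["TOKEN"]) with the url built in invoke.py should, per the behavior described above, give 200 for the valid case and 401 for the other two.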

Step 5: Read the audit trail in CloudWatch

Open the Bedrock AgentCore GenAI Observability dashboard in the CloudWatch console (the GenAI Observability section, Bedrock AgentCore tab). It has three views that map onto the data model. The Agents view lists every agent, runtime-hosted or not; click audited_agent to see runtime metrics. The Sessions view lists sessions, so the session whose ID begins with demo-session-001 from Step 4 should be there. The Traces view shows individual traces and spans: click one trace and you get the trajectory, a span for the model call with token counts and latency, and a span for the word_count tool call. Sessions contain traces, traces contain spans. That three-level hierarchy is the audit trail.

For raw metrics, go to CloudWatch, Metrics, All metrics, and open the Bedrock-AgentCore namespace: session count, latency, duration, token usage, and error rate are all there and can drive alarms. For the security side of the audit, the access record lives in CloudTrail, not the GenAI dashboard. Every InvokeAgentRuntime call is logged with the calling principal, so you can answer "who invoked this agent and when" by querying CloudTrail for that event name. Tracing tells you what the agent did; CloudTrail tells you who asked it to. Together they are what "auditable" actually means.

If you want traces from a session to correlate across multiple process runs (useful when an agent is invoked repeatedly for one user), set the session ID into OpenTelemetry baggage in your agent so every span inherits it:

from opentelemetry import baggage, context
context.attach(baggage.set_baggage("session.id", "demo-session-001"))

Verify it works

You have three checkpoints, and all three must be green. First, the happy path: python invoke.py with a valid TOKEN returns 200 and a sensible answer like a word count. Second, the wall: the same script with a missing or bogus token returns 401 Unauthorized and a WWW-Authenticate: Bearer resource_metadata="..." header, and crucially no agent output. Third, the trace: within a couple of minutes of a successful call, the Sessions view of the GenAI Observability dashboard shows the session whose ID begins with demo-session-001, and drilling into its trace shows at least two spans, one for the Bedrock model call and one for the word_count tool, each with a non-zero duration. If you see the 200, the 401, and the session with its spans, the agent is both access-controlled and audited. That is the contract for this episode.

When it breaks

If the dashboard stays empty after a successful invocation, Transaction Search is almost certainly not enabled or not finished propagating. Re-check Step 1 and remember it can take ten minutes; confirm the console shows "Ingest OpenTelemetry spans: Enabled." If you see traces but no spans for the model or tool, your image is missing aws-opentelemetry-distro in requirements.txt or you installed strands-agents without the [otel] extra. Rebuild and redeploy after fixing the requirements.

If every authenticated call returns 403 instead of 200, the token's claims do not match the authorizer. Decode the token payload with echo "$TOKEN" | cut -d '.' -f2 | base64 -D | jq and check that iss equals the issuer in your discovery document and that client_id equals the CLIENT_ID you passed to allowedClients. The -D flag is the macOS spelling; on Linux use base64 -d. JWT segments are base64url-encoded without padding, so if the decode fails or truncates, translate - and _ to + and / and append = until the length is a multiple of four. A mismatch on either claim is the usual culprit. If you get 401 even with what you think is a good token, it has probably expired; Cognito access tokens last 60 minutes by default, so re-run the initiate-auth command. Finally, if deploy.py fails creating the service-linked role, the deploying principal lacks iam:CreateServiceLinkedRole for runtime-identity.bedrock-agentcore.amazonaws.com; add that permission and retry.
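
The same three checks (iss, client_id, expiry) can be done in Python, which sidesteps the base64 flag and padding differences between platforms. This sketch only decodes the payload for debugging; it does not verify the signature, so never use it to decide whether to trust a token. The function names are hypothetical.

```python
import base64
import json
import time


def jwt_claims(token: str) -> dict:
    """Decode a JWT payload WITHOUT verifying the signature (debugging only)."""
    parts = token.split(".")
    if len(parts) != 3:
        raise ValueError("expected header.payload.signature")
    payload = parts[1]
    payload += "=" * (-len(payload) % 4)  # restore the padding base64url strips
    return json.loads(base64.urlsafe_b64decode(payload))


def claim_problems(token: str, issuer: str, client_id: str, now=None) -> list:
    """List mismatches between the token's claims and the authorizer settings."""
    claims = jwt_claims(token)
    now = time.time() if now is None else now
    problems = []
    if claims.get("iss") != issuer:
        problems.append(f"iss is {claims.get('iss')!r}, expected {issuer!r}")
    if claims.get("client_id") != client_id:
        problems.append(f"client_id is {claims.get('client_id')!r}, expected {client_id!r}")
    if claims.get("exp", 0) <= now:
        problems.append("token has expired (or has no exp claim)")
    return problems
```

Pass in $TOKEN, the issuer from your discovery document, and $CLIENT_ID; an empty list means the claims line up and the problem is elsewhere.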

Where to take it next

First, easy: swap the test user's password flow for a real scope check by adding allowedScopes to the authorizer and minting tokens with a custom scope, so the agent rejects valid users who lack the right permission, not just anonymous ones. Second, medium: add outbound auth so the agent can call a third party on the user's behalf, using a credential provider and the @requires_access_token decorator from the AgentCore SDK to run a three-legged OAuth flow into something like Google Drive. Third, harder: pipe the same OTEL spans to a second backend. AgentCore Observability is OTEL-compatible, so the identical telemetry can fan out to Datadog, Langfuse, or Arize Phoenix alongside CloudWatch by adding an exporter, which is how teams avoid locking their agent traces into one vendor. The question worth sitting with: once every agent call is access-controlled and traced, what is your retention policy on the prompts and tool arguments those spans capture, and who in your org is allowed to read them?

Clean up

Leaving an agent runtime registered is cheap when idle, but the observability ingestion and the Cognito pool are loose ends worth closing. Delete the runtime with the toolkit or the control API, then remove the Cognito pool and, if you created a throwaway ECR repo, that too.

# delete the agent runtime (use the ID from the ARN, not the full ARN)
aws bedrock-agentcore-control delete-agent-runtime \
  --agent-runtime-id <runtime-id> --region us-east-1
# delete the Cognito user pool
aws cognito-idp delete-user-pool --user-pool-id "$POOL_ID" --region us-east-1
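
Since delete-agent-runtime takes the runtime ID and not the ARN, a small helper can pull the ID out. This sketch assumes the ARN has the shape arn:aws:bedrock-agentcore:<region>:<account>:runtime/<runtime-id>; check it against the ARN your deploy actually printed. The function name and the example ARN are made up.

```python
def runtime_id_from_arn(arn: str) -> str:
    """Extract the runtime ID from an agent runtime ARN.

    Assumed shape: arn:aws:bedrock-agentcore:<region>:<account>:runtime/<runtime-id>
    """
    fields = arn.split(":", 5)
    if len(fields) != 6 or fields[0] != "arn":
        raise ValueError(f"not an ARN: {arn!r}")
    segments = fields[5].split("/")
    if len(segments) < 2 or segments[0] != "runtime" or not segments[1]:
        raise ValueError(f"not an agent runtime ARN: {arn!r}")
    return segments[1]


# Hypothetical ARN, for illustration only.
example = "arn:aws:bedrock-agentcore:us-east-1:123456789012:runtime/audited_agent-EXAMPLE"
print(runtime_id_from_arn(example))
```

The printed value is what goes into --agent-runtime-id.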

The bigger ongoing cost is CloudWatch: span data already ingested sits in the aws/spans log group and bills for storage until it ages out. To make it expire quickly, set a short retention on that log group with aws logs put-retention-policy --log-group-name aws/spans --retention-in-days 1. If you raised the Transaction Search indexing percentage, lower it back down with update-indexing-rule so future runs index fewer spans. Leaving Transaction Search enabled at the default 1% is fine; indexing at that rate is free.