SL#75 - AWS AI Series (15/30) - Build and Ship Your Own MCP Server on AWS Lambda

What we are building

Most MCP servers you have run so far live on your laptop. Claude Desktop or your IDE spawns them as child processes and talks to them over a stdio pipe. That model is fine for a single developer, but it falls apart the moment you want a tool that the whole team can use, that runs next to your AWS data, and that does not depend on someone's laptop being awake.

So we are going to build an MCP server that runs as a Lambda function. It will expose three tools: one that returns the current time, one that counts the S3 buckets in your account, and one stateful counter that remembers how many times it has been called within a conversation. We will put it behind Amazon API Gateway using the MCP Streamable HTTP transport, protect it with a bearer token, and then point both the MCP Inspector and Claude at the resulting HTTPS endpoint.

The non-obvious design choice here is the transport. The old MCP remote transport was Server-Sent Events, which assumes a long-lived connection. Lambda has no long-lived connection. Streamable HTTP fixes this: one POST request produces one response, and the MCP SDK frames the protocol inside that single request-response cycle. That maps perfectly onto Lambda's execution model, where each invocation handles exactly one message and returns. When you finish, you will have a real HTTPS URL that any Streamable-HTTP-capable client can call, and a clear path to harden it for production with OAuth and Bedrock AgentCore Gateway.
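To make that single request-response cycle concrete, here is a minimal sketch of the JSON-RPC 2.0 messages that travel in one POST and its reply. The helper names and the sample tool entry are illustrative assumptions, not output captured from a real server.

```python
import json
from typing import Optional


def make_request(request_id: int, method: str, params: Optional[dict] = None) -> str:
    """Serialize one JSON-RPC 2.0 request: the body of a single HTTP POST."""
    message = {"jsonrpc": "2.0", "id": request_id, "method": method}
    if params is not None:
        message["params"] = params
    return json.dumps(message)


def make_response(request_body: str, result: dict) -> str:
    """Serialize the matching response; the shared id ties it to the request."""
    request = json.loads(request_body)
    return json.dumps({"jsonrpc": "2.0", "id": request["id"], "result": result})


request_body = make_request(1, "tools/list")
response_body = make_response(
    request_body,
    {"tools": [{"name": "get_time", "description": "illustrative entry"}]},
)
print(request_body)
print(response_body)
```

Because the reply carries the same id as the request, nothing has to stay open between calls, which is what lets one Lambda invocation handle one message and exit.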

Prerequisites

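You need an AWS account with permission to create Lambda functions, an API Gateway API, a DynamoDB table, and the associated IAM roles. You need the AWS SAM CLI installed and configured with credentials (sam --version should print a version). You need Python 3.12 locally, because sam build looks for an interpreter that matches the python3.12 runtime in the template; if you only have a different version installed, sam build --use-container is the alternative. You also need Node.js 18+ so you can run the MCP Inspector through npx.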

You should be comfortable reading Python and running shell commands, and it helps to have built or at least run an MCP server before so the terms "tool", "transport", and "initialize" are not new. You do not need a Bedrock model for the server itself. Bedrock only matters if you later wire the server to an agent, which is out of scope here.

One thing to flag up front: this tutorial creates billable resources. They are all small and mostly inside the AWS Free Tier, but the cleanup section at the end is not optional. Run it when you are done.

Setup

We will use the official awslabs.mcp_lambda_handler package. This is the AWS Labs library for writing your own MCP servers inside Lambda, as opposed to wrapping an existing stdio server. Create the project skeleton:

mkdir mcp-on-lambda && cd mcp-on-lambda
mkdir -p server
python3 -m venv .venv && source .venv/bin/activate
pip install awslabs.mcp_lambda_handler boto3

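Declare the dependencies so sam build packages them with the function. The entries below are minimum-version constraints, not pins; for a reproducible build, replace >= with == and the exact versions you tested. Create server/requirements.txt: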

awslabs.mcp_lambda_handler>=0.1.0
boto3>=1.34.0

The whole thing will be deployed with SAM, so create a template.yaml at the project root with a single function, an API Gateway REST API event (Type: Api in SAM), a DynamoDB table for session state, and the IAM permissions the tools need. The bearer-token environment variable is added in Step 3:

AWSTemplateFormatVersion: "2010-09-09"
Transform: AWS::Serverless-2016-10-31
Resources:
  McpSessionsTable:
    Type: AWS::DynamoDB::Table
    Properties:
      AttributeDefinitions: [{AttributeName: session_id, AttributeType: S}]
      KeySchema: [{AttributeName: session_id, KeyType: HASH}]
      BillingMode: PAY_PER_REQUEST
  McpFunction:
    Type: AWS::Serverless::Function
    Properties:
      CodeUri: server/
      Handler: app.lambda_handler
      Runtime: python3.12
      Timeout: 30
      Environment:
        Variables:
          MCP_SESSION_TABLE: !Ref McpSessionsTable
      Policies:
        - DynamoDBCrudPolicy: {TableName: !Ref McpSessionsTable}
        - Statement: [{Effect: Allow, Action: s3:ListAllMyBuckets, Resource: "*"}]
      Events:
        Mcp:
          Type: Api
          Properties: {Path: /mcp, Method: post}
Outputs:
  McpUrl:
    Value: !Sub "https://${ServerlessRestApi}.execute-api.${AWS::Region}.amazonaws.com/Prod/mcp"

Before writing real tools, prove the toolchain works with a smoke test. Run sam validate. If it prints "template.yaml is a valid SAM Template", your SAM install and credentials are good and you can move on.

Step 1: Write the server with the first two tools

Create server/app.py. The handler library does the protocol work; you just decorate plain Python functions, and their type hints and docstrings become the tool schema the client sees.

import os
from datetime import datetime, timezone

import boto3
from awslabs.mcp_lambda_handler import MCPLambdaHandler

# Name of the DynamoDB table that holds per-session state (set in template.yaml).
session_table = os.environ.get("MCP_SESSION_TABLE")
mcp = MCPLambdaHandler(name="repo-toolbox", version="1.0.0", session_store=session_table)

@mcp.tool()
def get_time() -> str:
    """Return the current UTC time in ISO 8601 format."""
    return datetime.now(timezone.utc).isoformat()

@mcp.tool()
def count_s3_buckets() -> int:
    """Count the S3 buckets in the caller's AWS account."""
    s3 = boto3.client("s3")
    return len(s3.list_buckets().get("Buckets", []))

def lambda_handler(event, context):
    # Lambda entry point: hand the API Gateway event to the MCP handler.
    return mcp.handle_request(event, context)

Three things are doing real work here. The @mcp.tool() decorator registers the function and turns its signature into the JSON schema MCP clients use for discovery and validation, which is why the type hints (-> str, -> int) and the docstrings matter. The count_s3_buckets tool calls a live AWS API, which is exactly why the SAM template granted s3:ListAllMyBuckets to the function role. And mcp.handle_request parses the incoming MCP message, dispatches to the right tool, and formats the response, so your lambda_handler stays a one-liner.

Notice what is not here: no SSE loop, no connection management, no request parsing. The function starts, handles one MCP message, and returns. That statelessness is the point. It is also the constraint we deal with next.
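To build intuition for what the library takes off your hands, the sketch below is a toy version of the same idea: a decorator that registers functions, a helper that derives a tool description from type hints and the docstring, and a handler that processes exactly one message per call. It is not the awslabs implementation; the names tool, describe_tool, toy_handler, and the type mapping are all assumptions made for illustration.

```python
import inspect
import json
from typing import get_type_hints

# Assumed mapping from Python annotations to JSON Schema type names.
JSON_TYPES = {str: "string", int: "integer", float: "number", bool: "boolean"}
TOOLS = {}  # toy registry: tool name -> function


def tool(func):
    """Toy stand-in for a tool decorator: register the function by name."""
    TOOLS[func.__name__] = func
    return func


def describe_tool(func) -> dict:
    """Derive a tool description from the signature and docstring."""
    hints = get_type_hints(func)
    properties, required = {}, []
    for name, param in inspect.signature(func).parameters.items():
        properties[name] = {"type": JSON_TYPES.get(hints.get(name), "string")}
        if param.default is inspect.Parameter.empty:
            required.append(name)
    return {
        "name": func.__name__,
        "description": inspect.getdoc(func) or "",
        "inputSchema": {"type": "object", "properties": properties, "required": required},
    }


def toy_handler(event: dict, context=None) -> dict:
    """Handle one JSON-RPC message from a proxy-style event, then return."""
    message = json.loads(event.get("body") or "{}")
    method = message.get("method")
    if method == "tools/list":
        payload = {"result": {"tools": [describe_tool(f) for f in TOOLS.values()]}}
    elif method == "tools/call":
        params = message.get("params", {})
        func = TOOLS.get(params.get("name"))
        if func is None:
            payload = {"error": {"code": -32602, "message": "Unknown tool"}}
        else:
            text = str(func(**params.get("arguments", {})))
            payload = {"result": {"content": [{"type": "text", "text": text}]}}
    else:
        payload = {"error": {"code": -32601, "message": "Method not found"}}
    body = {"jsonrpc": "2.0", "id": message.get("id"), **payload}
    return {
        "statusCode": 200,
        "headers": {"Content-Type": "application/json"},
        "body": json.dumps(body),
    }


@tool
def add(a: int, b: int = 0) -> int:
    """Add two integers."""
    return a + b
```

Nothing in toy_handler survives between calls except the registry built at import time, which mirrors the constraint the next step addresses.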

Step 2: Add session state with DynamoDB

A stateless function cannot remember anything between invocations, but conversations often need memory. The handler solves this with a session store backed by DynamoDB, keyed on the session ID that the MCP client sends on every request (the Mcp-Session-Id header in Streamable HTTP). We already passed session_store=session_table when we constructed the handler, so the wiring is done. Now use it:

@mcp.tool()
def increment_counter() -> int:
    """Increment a per-conversation counter and return the new value."""
    counter = mcp.session.get("counter", 0) + 1
    mcp.session["counter"] = counter
    return counter

mcp.session behaves like a dictionary, but reads and writes go to the DynamoDB table you defined in the template, scoped to the current conversation's session ID. Two different Claude conversations calling increment_counter get two independent counters. The same conversation calling it three times gets 1, 2, 3, even though each call ran in a different Lambda invocation, possibly on a different physical machine.

This is the clean way to do state in serverless MCP. The function itself stays stateless and horizontally scalable, while the conversation's memory lives in a managed store that survives cold starts. If you tried to keep the counter in a module-level variable instead, it would reset unpredictably whenever Lambda recycled the execution environment, and it would be wrong the instant two invocations ran concurrently.
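The idea of keying state on the session ID can be sketched without AWS at all. In the snippet below, InMemorySessionStore is a hypothetical stand-in for the DynamoDB table (one item per session ID), and increment mimics what a stateless invocation does: load, modify, save.

```python
class InMemorySessionStore:
    """Hypothetical stand-in for a DynamoDB table: one item per session ID."""

    def __init__(self):
        self._items = {}

    def load(self, session_id: str) -> dict:
        # Return a copy so callers must save explicitly, as with a real table.
        return dict(self._items.get(session_id, {}))

    def save(self, session_id: str, data: dict) -> None:
        self._items[session_id] = dict(data)


def increment(store: InMemorySessionStore, session_id: str) -> int:
    """Load the session, bump its counter, and write it back."""
    session = store.load(session_id)
    session["counter"] = session.get("counter", 0) + 1
    store.save(session_id, session)
    return session["counter"]


store = InMemorySessionStore()
print([increment(store, "conversation-a") for _ in range(3)])  # [1, 2, 3]
print(increment(store, "conversation-b"))  # 1
```

Note that this load-modify-save sequence is not atomic. If the same session could issue concurrent calls against a real table, an atomic DynamoDB update (for example an UpdateItem with an ADD expression) would be the safer pattern.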

Step 3: Add a bearer token check

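An open MCP endpoint on the public internet is an open invitation. The MCP authorization spec builds on OAuth 2.1 and carries access tokens as bearer tokens in the Authorization header, so we will check for one. For a first deploy, the simplest correct approach is to read an expected token from an environment variable and reject anything that does not match. Replace the one-line lambda_handler at the bottom of app.py with this version. Do not leave the old definition below it: Python keeps the last definition of a name, so the unauthenticated handler would silently win.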

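import hmac  # place with the other imports at the top of app.py
EXPECTED_TOKEN = os.environ.get("MCP_AUTH_TOKEN", "")
def lambda_handler(event, context):
    # Header-name casing varies by client, so normalize before the lookup.
    headers = {k.lower(): v for k, v in (event.get("headers") or {}).items()}
    auth = headers.get("authorization") or ""
    # Fail closed when no token is configured; compare in constant time.
    expected = f"Bearer {EXPECTED_TOKEN}"
    if not EXPECTED_TOKEN or not hmac.compare_digest(auth.encode(), expected.encode()):
        return {"statusCode": 401, "body": "Unauthorized"}
    return mcp.handle_request(event, context)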

Then add MCP_AUTH_TOKEN to the function's environment in template.yaml under Variables, alongside MCP_SESSION_TABLE:

          MCP_AUTH_TOKEN: !Ref McpAuthToken

And declare the parameter at the top of the template so SAM prompts for it at deploy time:

Parameters:
  McpAuthToken:
    Type: String
    NoEcho: true

This is deliberately the minimum that is honest. It is a single shared secret checked in code, which is fine for a private team tool or a demo. It is not fine for a multi-tenant production system, where you want a real identity provider issuing scoped, expiring tokens. We will point at that upgrade path at the end. For now, a shared bearer token keeps random scanners out while you get the thing working.
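One way to gain confidence in the check before deploying is to factor it into a pure function and exercise it with hand-built events. The sketch below does that; is_authorized is a hypothetical helper name, and the events are simplified to the one field the check reads. It compares with hmac.compare_digest, which takes the same time regardless of where the two values differ.

```python
import hmac


def is_authorized(event: dict, expected_token: str) -> bool:
    """Return True only if the event carries the expected bearer token."""
    if not expected_token:
        # Fail closed: an unset token must never authorize anyone.
        return False
    headers = {k.lower(): v for k, v in (event.get("headers") or {}).items()}
    supplied = headers.get("authorization") or ""
    expected = f"Bearer {expected_token}"
    return hmac.compare_digest(supplied.encode(), expected.encode())


# Hand-built events covering the cases worth checking.
good = {"headers": {"Authorization": "Bearer s3cret"}}
lower = {"headers": {"authorization": "Bearer s3cret"}}
wrong = {"headers": {"Authorization": "Bearer nope"}}
absent = {"headers": None}

print(is_authorized(good, "s3cret"), is_authorized(wrong, "s3cret"))
```

With the logic isolated like this, the Lambda handler only has to translate a False result into a 401 response.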

Step 4: Deploy with SAM

Build and deploy. The --guided flag walks you through the first deploy and saves your answers to samconfig.toml so later deploys are a single command.

sam build
sam deploy --guided

SAM will ask for a stack name (use mcp-on-lambda), a region, and the McpAuthToken value you defined as a parameter. Pick a long random string for the token; openssl rand -hex 24 is a good source. Answer yes when it asks to allow IAM role creation, since the template creates the function's execution role.

When the deploy finishes, SAM prints the stack outputs, including McpUrl. That is your live MCP endpoint, something like https://abc123.execute-api.us-west-2.amazonaws.com/Prod/mcp. Copy it. Copy the token too. You need both for the next section.

The first sam build takes a minute because it packages your dependencies into the deployment artifact. Subsequent deploys after code changes are faster, and you can drop --guided and just run sam deploy because your choices are now in samconfig.toml.

Verify it works

The canonical way to test any MCP server is the MCP Inspector, the open-source client maintained by the Model Context Protocol project. Launch it:

npx @modelcontextprotocol/inspector

It opens a local web UI. Set the transport type to Streamable HTTP, paste your McpUrl into the URL field, and add a header Authorization with value Bearer <your-token>. Click Connect. If the connection succeeds, the Inspector will run the MCP initialize handshake and then list your tools.

Click the Tools tab. You should see get_time, count_s3_buckets, and increment_counter, each with the description from its docstring. Run get_time; you get an ISO timestamp back. Run count_s3_buckets; you get the real number of buckets in your account. Run increment_counter three times in a row and watch it return 1, then 2, then 3. That sequence is the proof that the DynamoDB-backed session state is working across separate Lambda invocations.

You can also verify from the command line, which is handy for CI. Export MCP_URL and MCP_TOKEN with the values from the deploy step first:

npx @modelcontextprotocol/inspector --cli --method tools/list \
  --transport http --server-url "$MCP_URL" \
  --header "Authorization: Bearer $MCP_TOKEN"

If that prints JSON whose tools array contains your three tool names, the server is live, authenticated, and discoverable. That is the contract for this tutorial: tools listed, get_time returns a timestamp, and the counter increments. To use it from Claude, add the same URL and Authorization header as a custom remote MCP connector, or bridge it locally with npx mcp-remote <McpUrl> --header "Authorization: Bearer <token>" so a stdio-only client can reach it.
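If a CI job has Python but not Node.js, a similar tools/list probe can be sketched with the standard library alone. build_mcp_request is a hypothetical helper; the Accept header follows the Streamable HTTP convention of accepting both JSON and event-stream replies. Whether the server answers tools/list without a prior initialize call, or expects a session header, is an assumption to confirm against your deployment.

```python
import json
import urllib.request


def build_mcp_request(url: str, token: str, method: str, request_id: int = 1) -> urllib.request.Request:
    """Build (but do not send) one authenticated JSON-RPC POST."""
    body = json.dumps({"jsonrpc": "2.0", "id": request_id, "method": method}).encode()
    return urllib.request.Request(
        url,
        data=body,
        method="POST",
        headers={
            "Content-Type": "application/json",
            "Accept": "application/json, text/event-stream",
            "Authorization": f"Bearer {token}",
        },
    )


# Sending requires a deployed endpoint, so it is left commented out:
# request = build_mcp_request(os.environ["MCP_URL"], os.environ["MCP_TOKEN"], "tools/list")
# with urllib.request.urlopen(request, timeout=30) as response:
#     print(json.load(response))
```

Keeping construction separate from sending means the request shape can be checked in unit tests, with the network call reserved for the live smoke test.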

When it breaks

If the Inspector returns 401 Unauthorized, the token does not match. Check that the header value is exactly Bearer followed by the token, with a single space and no quotes, and that the MCP_AUTH_TOKEN you deployed matches the one you are sending. Header-name casing also varies by client and by API Gateway API type, which is why the handler normalizes names to lowercase before reading authorization.

If you get a 500 on count_s3_buckets but get_time works, the function ran but the S3 call failed. The most likely cause is that the IAM policy did not attach. Confirm the s3:ListAllMyBuckets statement is under the function's Policies in the template and redeploy. An AccessDenied in the CloudWatch logs for the function confirms this diagnosis.

If the counter resets to 1 every call instead of climbing, session state is not persisting. The usual cause is a missing or misnamed MCP_SESSION_TABLE environment variable, so the handler silently falls back to no store. Check the function's environment in the Lambda console and confirm the DynamoDBCrudPolicy is present so writes are not being rejected.

If the connection hangs or fails before any tool list appears, you are almost certainly on the wrong transport. This server speaks Streamable HTTP, not SSE. An SSE client will not complete the handshake. Switch the Inspector's transport to Streamable HTTP.

If sam deploy itself fails on the IAM step, re-run it and answer yes to "Allow SAM CLI IAM role creation", or add --capabilities CAPABILITY_IAM.

Where to take it next

The fastest useful extension is to add a tool that touches a service you already run, for example a query_orders(customer_id: str) tool that reads from a DynamoDB table or calls an internal API. The pattern is identical: write the function, add the IAM permission to the template, redeploy. Every new tool is one decorated function.
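As a rough sketch of that pattern, the query logic can live in a plain function that takes the table as an argument, which keeps it testable with a fake. Everything specific here is an assumption: the customer_id key name, the ORDERS_TABLE variable, and the choice to return a JSON string.

```python
import json


def fetch_orders(table, customer_id: str) -> str:
    """Query one customer's orders and return them as a JSON string."""
    response = table.query(
        KeyConditionExpression="customer_id = :cid",
        ExpressionAttributeValues={":cid": customer_id},
    )
    # default=str keeps Decimal values (boto3's number type) serializable.
    return json.dumps(response.get("Items", []), default=str)


class FakeTable:
    """Test double that mimics the one Table method used above."""

    def __init__(self, items):
        self._items = items

    def query(self, KeyConditionExpression, ExpressionAttributeValues):
        wanted = ExpressionAttributeValues[":cid"]
        return {"Items": [i for i in self._items if i["customer_id"] == wanted]}


table = FakeTable([
    {"customer_id": "c1", "order_id": "o1"},
    {"customer_id": "c2", "order_id": "o2"},
])
print(fetch_orders(table, "c1"))

# In app.py the tool wrapper might look like this (ORDERS_TABLE is hypothetical):
# @mcp.tool()
# def query_orders(customer_id: str) -> str:
#     """Return the orders for one customer as JSON."""
#     orders = boto3.resource("dynamodb").Table(os.environ["ORDERS_TABLE"])
#     return fetch_orders(orders, customer_id)
```

The matching template change would be a read-only policy on the orders table, in the same Policies list that already grants the S3 permission.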

The next step up is real authentication. Swap the shared bearer token for an OAuth provider such as Amazon Cognito, Okta, or Auth0, and validate the JWT in an API Gateway authorizer instead of in your handler. This gives you scoped, expiring tokens per user rather than one secret for everyone.
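For orientation, this is roughly the shape of a token-based Lambda authorizer for an API Gateway REST API: it receives the raw Authorization value and the method ARN, and returns an IAM policy that allows or denies the call. verify_jwt is a deliberately unimplemented placeholder; real signature, expiry, and audience checks belong in a JWT library configured for your identity provider, and none of that is shown here.

```python
def build_policy(principal_id: str, effect: str, resource: str) -> dict:
    """Build the policy document a Lambda authorizer returns."""
    return {
        "principalId": principal_id,
        "policyDocument": {
            "Version": "2012-10-17",
            "Statement": [
                {"Action": "execute-api:Invoke", "Effect": effect, "Resource": resource}
            ],
        },
    }


def verify_jwt(token: str) -> dict:
    """Hypothetical placeholder: verify the token and return its claims."""
    raise NotImplementedError("plug in a JWT library for your identity provider")


def authorizer_handler(event: dict, context=None, verify=verify_jwt) -> dict:
    """Allow the call only when the bearer token verifies."""
    raw = event.get("authorizationToken", "")
    token = raw[len("Bearer "):] if raw.startswith("Bearer ") else ""
    try:
        claims = verify(token) if token else None
    except Exception:
        # Any verification failure is treated as a denial.
        claims = None
    if not claims:
        return build_policy("anonymous", "Deny", event["methodArn"])
    return build_policy(str(claims.get("sub", "user")), "Allow", event["methodArn"])
```

Moving the check into an authorizer means rejected requests never reach the MCP function, and the handler no longer needs to know the secret.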

The production-grade move is to put the server behind Amazon Bedrock AgentCore Gateway instead of raw API Gateway. The Gateway handles OAuth, advertises your tool schema to clients, and validates inputs and outputs against it. The run-mcp-servers-with-aws-lambda library ships a BedrockAgentCoreGatewayTargetHandler and a full deployable example for exactly this path, and the AgentCore Gateway Lambda target documentation walks through registering the function as a target. Once it is a Gateway target, the same server is reachable by AgentCore agents, by Kiro, and by any other MCP client, all through one managed, authenticated endpoint. You wrote the tools once; now decide who gets to call them.