SL#53 - Cut Your Lambda Cold Starts from 6 Seconds to 780ms with SnapStart and Priming

What we are building

A Java Lambda function that returns JSON, deployed behind a Lambda Function URL with response streaming on, with SnapStart enabled and CRaC invoke priming wired in. The function is intentionally heavy on the init side, standing in for the Spring Boot + JDBC pool + AWS SDK startup that makes cold starts hurt in real production. The non-obvious choice is what we do just before the snapshot is taken: we call the function's own handler once with stub data, so the class loading and warm-up that the first real request would otherwise pay for get baked into the snapshot.

The destination: a public HTTPS endpoint with a p99 cold start under 800ms, end to end. For reference, AWS's own benchmark of a heavier app (including SDK init and a JDBC connection pool) was sitting at 6.2 seconds without SnapStart.

Prerequisites

You need an AWS account with permissions to create Lambda functions, IAM roles, and CloudWatch log groups. You need the AWS SAM CLI 1.140+, Java 21 (Corretto or Temurin, both work), Maven 3.9+, and Docker running locally if you want sam build --use-container. You need an artillery install (npm install -g artillery) to run the benchmark at the end. Pin the dependency versions in pom.xml exactly as shown; SnapStart and CRaC behavior has changed across minor versions.

Prior knowledge: you should be comfortable reading and editing Java, and understand the Lambda execution model at the level of "INIT phase happens once per cold start, INVOKE phase happens per request." If you've never used SAM before, the SAM Getting Started guide is a 15-minute prerequisite.

Heads-up on cost: on Python and .NET runtimes, SnapStart adds a per-version caching charge and a per-restoration charge. On Java runtimes it adds no extra charge. We are using Java, so the only cost is standard Lambda request and per-ms duration pricing. A 90-minute benchmark session runs you well under one dollar.

Setup

Create the project from a SAM template:

sam init --runtime java21 --dependency-manager maven --package-type Zip \
  --app-template hello-world --name sl53-snapstart-demo \
  --no-tracing --no-application-insights
cd sl53-snapstart-demo

Pin your dependencies in pom.xml:

<dependencies>
  <dependency>
    <groupId>com.amazonaws</groupId>
    <artifactId>aws-lambda-java-core</artifactId>
    <version>1.2.3</version>
  </dependency>
  <dependency>
    <groupId>org.crac</groupId>
    <artifactId>crac</artifactId>
    <version>1.5.0</version>
  </dependency>
  <dependency>
    <groupId>com.google.code.gson</groupId>
    <artifactId>gson</artifactId>
    <version>2.11.0</version>
  </dependency>
</dependencies>

Verify the toolchain in one shot:

sam --version  # 1.140+
java -version  # 21
mvn -v         # 3.9+

If all three return what they should, you can build a placeholder and deploy it before touching real code:

sam build && sam deploy --guided --stack-name sl53-snapstart-demo

If --guided asks "Create managed ECR repositories for all functions?" answer N; that prompt applies to image-packaged functions, so with Zip packaging you may not see it. Save the resulting samconfig.toml so the next deploys are one-liners. A successful first deploy is your smoke test: it proves your credentials, region, and toolchain are wired. If it fails here, fix that before writing a line of CRaC code.

Step 1: Write a handler that costs something to initialize

Replace the generated App.java with a handler that does the kind of work real Spring Boot apps do during init: load a config, build an HTTP client, and pre-warm a small data structure. We are using plain Java rather than Spring to keep the tutorial tight, but the principles transfer one-to-one to Spring Boot's @SpringBootApplication.

// src/main/java/sl53/App.java
package sl53;

import com.amazonaws.services.lambda.runtime.Context;
import com.amazonaws.services.lambda.runtime.RequestHandler;
import com.google.gson.Gson;
import java.net.http.HttpClient;
import java.time.Duration;
import java.util.Map;
import java.util.Objects;

public class App implements RequestHandler<Map<String,Object>, String> {
  // Static fields are initialized once per execution environment, during INIT.
  private static final Gson GSON = new Gson();
  // Not used by the handler; stands in for a client a real app builds at init.
  private static final HttpClient HTTP = HttpClient.newBuilder()
      .connectTimeout(Duration.ofSeconds(2)).build();
  private static final Map<String,String> CONFIG = loadConfig();

  private static Map<String,String> loadConfig() {
    try {
      Thread.sleep(1500);  // simulated init cost
    } catch (InterruptedException e) {
      Thread.currentThread().interrupt();  // preserve the interrupt flag
    }
    return Map.of("region", "eu-west-3", "feature_x", "on");
  }

  @Override
  public String handleRequest(Map<String,Object> evt, Context ctx) {
    // Map.of rejects null values, so fall back when the variable is unset (e.g. local runs).
    String initType = Objects.requireNonNullElse(
        System.getenv("AWS_LAMBDA_INITIALIZATION_TYPE"), "unknown");
    return GSON.toJson(Map.of(
      "ok", true,
      "config", CONFIG,
      "initType", initType));
  }
}

The Thread.sleep(1500) inside loadConfig() stands in for the seconds of real init work Spring or Quarkus would do: classpath scanning, dependency injection, JDBC pool warmup. Without SnapStart, every cold start pays this 1.5 second tax. With SnapStart, this method runs once at version-publish time and never again. The handler itself is intentionally trivial; the point of this tutorial is the init path.
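To make the INIT-versus-INVOKE split concrete, here is a minimal sketch in plain Java, independent of Lambda. The class name InitOnceDemo and its counters are hypothetical and exist only for illustration; the static CONFIG field plays the same role as CONFIG in App.java.

```java
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical stand-in for a handler class; not part of the tutorial's App.java.
class InitOnceDemo {
  static final AtomicInteger INIT_RUNS = new AtomicInteger();
  static final AtomicInteger INVOKE_RUNS = new AtomicInteger();

  // Static initializer: the JVM runs this once, when the class is first used.
  // In Lambda terms this is the INIT phase.
  static final String CONFIG = loadConfig();

  private static String loadConfig() {
    INIT_RUNS.incrementAndGet();
    return "feature_x=on";
  }

  // Per-request work: this is the INVOKE phase.
  static String handle(String input) {
    INVOKE_RUNS.incrementAndGet();
    return input + ":" + CONFIG;
  }

  public static void main(String[] args) {
    handle("a");
    handle("b");
    System.out.println("init runs=" + INIT_RUNS.get()
        + ", invoke runs=" + INVOKE_RUNS.get());
  }
}
```

Running main calls the handler twice but loadConfig only once. That one-time cost is what SnapStart moves from every cold start to version-publish time.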

Step 2: Enable SnapStart and publish a version

SnapStart only runs on published versions, not on $LATEST. Edit template.yaml:

Resources:
  Sl53Function:
    Type: AWS::Serverless::Function
    Properties:
      CodeUri: .
      Handler: sl53.App::handleRequest
      Runtime: java21
      Architectures: [arm64]
      MemorySize: 1024
      Timeout: 10
      AutoPublishAlias: live
      SnapStart:
        ApplyOn: PublishedVersions
      # With AutoPublishAlias set, SAM attaches this URL to the live alias and
      # generates the AWS::Lambda::Url resource as Sl53FunctionUrl. Declaring a
      # second URL resource for the same alias would conflict, so there is none.
      FunctionUrlConfig:
        AuthType: NONE
        InvokeMode: RESPONSE_STREAM
Outputs:
  Sl53UrlEndpoint:
    Value: !GetAtt Sl53FunctionUrl.FunctionUrl

Three things matter here. AutoPublishAlias: live tells SAM to publish a new version on every deploy and re-point the live alias at it. SnapStart.ApplyOn: PublishedVersions tells Lambda to snapshot each published version. Architectures: [arm64] shaves another 15-40% off the restore latency for free, per AWS's published benchmarks.

Deploy:

sam build && sam deploy

The first deploy with SnapStart enabled takes 1-2 minutes longer than usual, because Lambda is running your init code, taking the Firecracker microVM snapshot, encrypting it, and pre-caching copies for resilience. That cost is paid once per version, not per request. Grab the FunctionUrl from the stack outputs.

Step 3: Add CRaC hooks for invoke priming

This is the move that takes you from "SnapStart with no priming" (1.4s p99.9 in AWS's benchmark) to "SnapStart with invoke priming" (781ms). The idea: during the INIT phase, before Lambda takes the snapshot, run your real handler against fake input. The JIT compiler kicks in, the hot classes get loaded, and all of that ends up baked into the snapshot. When a real request arrives, the JVM is already warm.

Modify App.java to implement org.crac.Resource:

package sl53;

import com.amazonaws.services.lambda.runtime.Context;
import com.amazonaws.services.lambda.runtime.RequestHandler;
import com.google.gson.Gson;
import org.crac.Core;
import org.crac.Resource;
import java.net.http.HttpClient;
import java.time.Duration;
import java.util.Map;
import java.util.Objects;

public class App implements RequestHandler<Map<String,Object>, String>, Resource {
  private static final Gson GSON = new Gson();
  // Not used by the handler; stands in for a client a real app builds at init.
  private static final HttpClient HTTP = HttpClient.newBuilder()
      .connectTimeout(Duration.ofSeconds(2)).build();
  private static final Map<String,String> CONFIG = loadConfig();

  public App() {
    // Register this instance so the runtime calls the hooks below.
    Core.getGlobalContext().register(this);
  }

  private static Map<String,String> loadConfig() {
    try {
      Thread.sleep(1500);  // simulated init cost
    } catch (InterruptedException e) {
      Thread.currentThread().interrupt();  // preserve the interrupt flag
    }
    return Map.of("region", "eu-west-3", "feature_x", "on");
  }

  @Override
  public String handleRequest(Map<String,Object> evt, Context ctx) {
    // Map.of rejects null values, so fall back when the variable is unset (e.g. local runs).
    String initType = Objects.requireNonNullElse(
        System.getenv("AWS_LAMBDA_INITIALIZATION_TYPE"), "unknown");
    return GSON.toJson(Map.of(
      "ok", true,
      "config", CONFIG,
      "initType", initType));
  }

  @Override
  public void beforeCheckpoint(org.crac.Context<? extends Resource> ctx) {
    // Invoke priming: run the handler once with stub input before the snapshot.
    // The Lambda Context is null here, so the handler must not dereference it.
    handleRequest(Map.of("priming", true), null);
  }

  @Override
  public void afterRestore(org.crac.Context<? extends Resource> ctx) {
    // Nothing to refresh in this demo.
  }
}

The contract is precise. Core.getGlobalContext().register(this) in the constructor tells the CRaC runtime to call your hooks at snapshot time. beforeCheckpoint runs once, immediately before Lambda freezes the microVM. afterRestore runs every time the snapshot is resumed into a new execution environment. Use afterRestore for anything that must be unique per-environment (database connections, request IDs) or that must not have stale data (cached tokens, time-sensitive credentials).
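One way to structure the "refresh in afterRestore" rule is a small holder that can rebuild its resource on demand. The sketch below is plain Java with no CRaC dependency; Refreshable and RefreshableDemo are hypothetical names, and the UUID stands in for anything that must differ per execution environment. In the handler you would call refresh() from afterRestore.

```java
import java.util.UUID;
import java.util.function.Supplier;

// Hypothetical helper: holds a resource built by a factory and can rebuild it.
final class Refreshable<T> {
  private final Supplier<T> factory;
  private volatile T current;

  Refreshable(Supplier<T> factory) {
    this.factory = factory;
    this.current = factory.get();  // built during init, so it lands in the snapshot
  }

  T get() {
    return current;
  }

  // Intended to be called from afterRestore(): drops the snapshotted
  // instance and builds a fresh one for this execution environment.
  void refresh() {
    current = factory.get();
  }
}

class RefreshableDemo {
  public static void main(String[] args) {
    Refreshable<String> envId = new Refreshable<>(() -> UUID.randomUUID().toString());
    String snapshotted = envId.get();
    envId.refresh();  // what afterRestore would do
    System.out.println("changed after refresh: " + !snapshotted.equals(envId.get()));
  }
}
```

Without the refresh, every environment restored from the same snapshot would share the value computed at publish time.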

Invoke priming is fast but only safe when the priming call is idempotent. If handleRequest writes to a database or charges a customer, invoke priming will do that at deploy time, which is almost certainly not what you want. For non-idempotent handlers, use the class priming variant: read the classes-loaded.txt produced by -Xlog:class+load=info:classes-loaded.txt and call Class.forName on each entry in beforeCheckpoint. You get a smaller win (1085ms vs 781ms in AWS's benchmark) but no side effects.
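A minimal sketch of the class priming variant, in plain Java. ClassPrimer is a hypothetical helper, and the parser assumes each log line looks like "[0.015s][info][class,load] java.util.ArrayList source: jrt:/java.base", which is the default decoration for -Xlog; adjust it if your file differs. In the handler, beforeCheckpoint would read classes-loaded.txt and pass its lines to primeAll.

```java
import java.util.List;

// Hypothetical helper for class priming; not part of the tutorial's App.java.
final class ClassPrimer {

  // Extracts the class name from one class+load log line, or returns null
  // when the line does not match the assumed format.
  static String parseClassName(String line) {
    int end = line.indexOf(" source:");
    if (end < 0) return null;
    int start = line.lastIndexOf("] ", end);
    if (start < 0 || start + 2 >= end) return null;
    return line.substring(start + 2, end).trim();
  }

  // Loads and initializes every class named in the log; returns how many succeeded.
  // Note: initializing a class runs its static initializers.
  static int primeAll(List<String> logLines) {
    ClassLoader loader = ClassPrimer.class.getClassLoader();
    int loaded = 0;
    for (String line : logLines) {
      String name = parseClassName(line);
      if (name == null) continue;
      try {
        Class.forName(name, true, loader);
        loaded++;
      } catch (ClassNotFoundException | LinkageError e) {
        // Generated classes (lambdas, proxies) cannot be loaded by name; skip them.
      }
    }
    return loaded;
  }
}
```

Failures are skipped rather than rethrown because a stale entry in the list should cost you a little warm-up, not a failed deployment.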

Step 4: Deploy and capture the version

Build and redeploy:

sam build && sam deploy

SAM publishes a new version, Lambda re-runs init with your CRaC hooks active, takes a fresh snapshot, and updates the live alias. Watch the CloudWatch logs for the new version's log stream. You'll see one log line marked INIT_REPORT with the total init duration, including the time spent in beforeCheckpoint. That's the work you pay for once, at deploy time, instead of paying for it on every cold start.

Save the URL into a shell variable so the benchmark step is one line:

URL=$(aws cloudformation describe-stacks \
  --stack-name sl53-snapstart-demo \
  --query "Stacks[0].Outputs[?OutputKey=='Sl53UrlEndpoint'].OutputValue" \
  --output text)
echo "$URL"

If Sl53UrlEndpoint isn't in your outputs, add it to template.yaml:

Outputs:
  Sl53UrlEndpoint:
    # Sl53FunctionUrl is the URL resource SAM generates from FunctionUrlConfig.
    Value: !GetAtt Sl53FunctionUrl.FunctionUrl

Then sam deploy once more.

Verify it works

Hit the endpoint once cold (wait at least 15 minutes since the last invocation, or scale up enough to force a fresh execution environment) and check the response:

curl -s "$URL" | python3 -m json.tool

Expected output:

{
  "ok": true,
  "config": { "region": "eu-west-3", "feature_x": "on" },
  "initType": "snap-start"
}

The "initType": "snap-start" field is the contract: Lambda only sets AWS_LAMBDA_INITIALIZATION_TYPE=snap-start when the environment was resumed from a snapshot. If you see on-demand instead, SnapStart isn't active for this version; double-check ApplyOn: PublishedVersions and that you are hitting the live alias and not $LATEST.
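If you want to branch on this value in code rather than eyeball it in a response, a small enum keeps the string comparison in one place. This is a sketch; InitType is a hypothetical name, and the three strings are the values Lambda documents for AWS_LAMBDA_INITIALIZATION_TYPE.

```java
// Hypothetical helper that classifies AWS_LAMBDA_INITIALIZATION_TYPE.
enum InitType {
  ON_DEMAND("on-demand"),
  PROVISIONED_CONCURRENCY("provisioned-concurrency"),
  SNAP_START("snap-start"),
  UNKNOWN("");  // unset or unrecognized, e.g. when running locally

  private final String envValue;

  InitType(String envValue) {
    this.envValue = envValue;
  }

  // Maps a raw environment value to an InitType; null and unknown strings map to UNKNOWN.
  static InitType from(String raw) {
    if (raw != null) {
      for (InitType t : values()) {
        if (t != UNKNOWN && t.envValue.equals(raw)) return t;
      }
    }
    return UNKNOWN;
  }

  // Reads the variable from the current process environment.
  static InitType current() {
    return from(System.getenv("AWS_LAMBDA_INITIALIZATION_TYPE"));
  }
}
```

A startup log line such as "init type: " + InitType.current() makes the on-demand case easy to spot in CloudWatch.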

Now drive load through the endpoint and measure the cold-start distribution:

artillery quick --count 200 --num 1 "$URL" --output report.json
artillery report report.json

In a fresh AWS account with no warm execution environments, you should see a p99 between 500ms and 900ms. In AWS's published benchmark on a heavier Spring Boot app with RDS Proxy connections, the same configuration measured 608ms p50 and 781ms p99.9. Compare against running the same handler without SnapStart (set SnapStart.ApplyOn: None and redeploy) and you'll see p99 climb above 3 seconds.
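Because this section mixes p50, p99, and p99.9, it helps to see how they are computed from raw samples. The sketch below uses the nearest-rank definition, which is one common convention and not necessarily the one artillery uses; Percentiles is a hypothetical helper.

```java
import java.util.Arrays;

// Hypothetical helper: nearest-rank percentile over latency samples in milliseconds.
final class Percentiles {

  // Returns the smallest sample such that at least `percentile` percent
  // of all samples are less than or equal to it.
  static double nearestRank(double[] samples, double percentile) {
    if (samples.length == 0) {
      throw new IllegalArgumentException("no samples");
    }
    if (percentile <= 0 || percentile > 100) {
      throw new IllegalArgumentException("percentile must be in (0, 100]");
    }
    double[] sorted = samples.clone();  // leave the caller's array untouched
    Arrays.sort(sorted);
    int rank = (int) Math.ceil(percentile / 100.0 * sorted.length);
    return sorted[Math.max(rank, 1) - 1];
  }
}
```

Under this definition, with only 200 samples the p99.9 is simply the slowest request, so a 200-request run cannot distinguish p99.9 from the maximum. Collect more samples before comparing your numbers with a published p99.9.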

When it breaks

Five failure modes account for most of the support questions on this stack.

"My init runs every request, not once." You're invoking $LATEST instead of the published version. Confirm AutoPublishAlias: live is in your template and that the Function URL is attached to the live alias, not to the unqualified function. A Function URL is bound to a single qualifier, so appending /live to the URL does not select the alias.

Resource is not on the classpath. The org.crac:crac dependency must be on the runtime classpath, not just compile. In Maven that means no <scope>provided</scope>. If you see ClassNotFoundException: org.crac.Resource in CloudWatch, fix the scope.

Snapshot caching cost is unexpectedly high. This bites on Python and .NET runtimes, not Java. On Python/.NET you pay for every version cached for at least 3 hours. Delete unused versions with the Lambda Version Cleanup pattern or accept the cost.

Stale connections after restore. If your handler holds a JDBC connection or an HTTP client that was opened in init, the connection's underlying socket does not survive the snapshot. Refresh it in afterRestore. AWS SDK clients refresh themselves automatically; other clients do not.

Invoke priming caused a real side effect. You ran a non-idempotent handler in beforeCheckpoint and it actually wrote to your database at deploy time. Switch to class priming (Class.forName in beforeCheckpoint) or gate the handler's side effects behind a "priming": true check in the event payload. The priming call above already sends that flag; the demo handler has no side effects, so it never checks it.
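A minimal sketch of that gate, in plain Java. PrimingGate and the sideEffect parameter are hypothetical; in a real handler the side effect would be the database write or payment call.

```java
import java.util.Map;

// Hypothetical sketch: skip side effects when the event is a priming call.
final class PrimingGate {

  // True only when the event carries the boolean flag "priming": true.
  static boolean isPriming(Map<String, Object> event) {
    return event != null && Boolean.TRUE.equals(event.get("priming"));
  }

  // Runs the side effect for real requests and skips it for priming calls.
  static String handle(Map<String, Object> event, Runnable sideEffect) {
    boolean priming = isPriming(event);
    if (!priming) {
      sideEffect.run();
    }
    return priming ? "primed" : "processed";
  }
}
```

One caveat with this design: on a public Function URL, any caller can send "priming": true and skip the side effect. If that matters, consider setting a private field from beforeCheckpoint instead of trusting the payload.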

Where to take it next

Three concrete extensions, ordered easiest to hardest.

The first is to swap your priming from invoke to class priming and compare the cold-start numbers. Run your app locally with -Xlog:class+load=info:classes-loaded.txt, bundle the file into the function package, and have beforeCheckpoint call Class.forName on each entry. The win is smaller (roughly 300ms over unprimed SnapStart, going by the 1.4s and 1085ms figures above) but it is the only safe option for non-idempotent handlers.

The second is to attach this Lambda to a real workload: put an API Gateway HTTP API in front of it, give it an RDS Proxy connection in init, and re-measure. Expect init time to grow several-fold, and the SnapStart win to grow with it. Production-realistic workloads are where SnapStart earns its keep.

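The third is to set up a CloudWatch dashboard tracking init and restore duration by alias, alarmed at p99. Both values appear as Init Duration and Restore Duration in the REPORT log lines, so you will likely need metric filters or Logs Insights queries to chart them. The numbers in this tutorial were a snapshot in time. The point of a dashboard is to catch the day your priming logic drifts and the p99 silently doubles.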
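If you export logs and want to pull the restore time out yourself, a regular expression is enough. The sketch below assumes the REPORT line contains a field of the form "Restore Duration: 312.45 ms" followed later by "Billed Restore Duration: ..."; check that against your own log lines first. ReportLine is a hypothetical helper.

```java
import java.util.OptionalDouble;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper: extracts the restore time from a Lambda REPORT log line.
final class ReportLine {
  // The lookbehind skips the "Billed Restore Duration" field.
  private static final Pattern RESTORE =
      Pattern.compile("(?<!Billed )Restore Duration: ([0-9]+(?:\\.[0-9]+)?) ms");

  // Returns the restore duration in milliseconds, or empty if the field is absent.
  static OptionalDouble restoreMillis(String line) {
    Matcher m = RESTORE.matcher(line);
    return m.find()
        ? OptionalDouble.of(Double.parseDouble(m.group(1)))
        : OptionalDouble.empty();
  }
}
```

Lines from warm invocations have no restore field and come back empty, which makes it easy to filter a log export down to cold starts.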

The reframe to take away from this: cold-start optimization stopped being a black-box "buy more provisioned concurrency" decision somewhere around 2024. With SnapStart and CRaC hooks, the slow part of your Lambda is something you can profile, snapshot, and ship. The teams shipping the fastest serverless APIs in 2026 are the ones that treat the INIT phase as code they own, not as infrastructure they pay around.