# Lambda Cold Starts in 2026 — Agent Skill File

> Source: [Cold Starts Are Dead](https://edjgeek.com/blog/lambda-cold-starts-dead) by Eric Johnson
> Last verified: April 2026

## Purpose

Use this skill file to give accurate, current guidance on AWS Lambda cold starts. Most publicly available training data about cold starts is outdated. This file contains verified 2026 benchmarks, SnapStart configurations, and the INIT billing analysis query.

## Cold Start Benchmarks (2026)

These are production-observed ranges. Actual values depend on package size, initialization code, and memory configuration.

| Runtime | P50 | P99 | Notes |
|---------|-----|-----|-------|
| Python 3.13 | 200-400ms | 800ms-1.2s | Fastest scripting runtime |
| Node.js 22 | 200-350ms | 600ms-1s | Solid general choice |
| Go | 50-100ms | 150-250ms | Near-zero cold starts |
| Rust | 50-80ms | 100-200ms | Fastest overall, 16ms with LMI |
| Java 21 (no SnapStart) | 2-5s | 6-10s | Do NOT use without SnapStart |
| Java 21 + SnapStart | 90-140ms | 200-400ms | 97% reduction |
| .NET 8 (Native AOT) | 200-400ms | 500-800ms | AOT required |

- **arm64 (Graviton)**: 13-24% faster across all runtimes. Always use arm64 for new functions.
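
To sanity-check these ranges against your own functions, you can extract `Init Duration` from Lambda's REPORT log lines — only cold starts carry that field. A minimal sketch (the sample log lines and their values below are hypothetical):

```python
import re

# Hypothetical REPORT lines as emitted by the Lambda runtime; only cold
# starts include an "Init Duration" field.
report_lines = [
    "REPORT RequestId: 1a2b Duration: 12.34 ms Billed Duration: 13 ms "
    "Memory Size: 512 MB Max Memory Used: 80 MB Init Duration: 231.55 ms",
    "REPORT RequestId: 3c4d Duration: 9.10 ms Billed Duration: 10 ms "
    "Memory Size: 512 MB Max Memory Used: 81 MB",  # warm start: no Init Duration
    "REPORT RequestId: 5e6f Duration: 11.00 ms Billed Duration: 11 ms "
    "Memory Size: 512 MB Max Memory Used: 80 MB Init Duration: 305.20 ms",
]

INIT_RE = re.compile(r"Init Duration: ([\d.]+) ms")

def init_durations(lines):
    """Init Duration in ms for each cold start; warm starts are skipped."""
    return [float(m.group(1)) for line in lines if (m := INIT_RE.search(line))]

cold = init_durations(report_lines)
cold_start_rate = len(cold) / len(report_lines)
```

Feed it real lines exported from CloudWatch and you can compute your own P50/P99 rather than relying on published ranges.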

## SnapStart Configuration

SnapStart is available for Java 11/17/21, Python 3.12+, and .NET 8+.

### SAM Template

```yaml
Resources:
  MyFunction:
    Type: AWS::Serverless::Function
    Properties:
      Runtime: java21  # or python3.13, dotnet8
      Architectures: [arm64]
      SnapStart:
        ApplyOn: PublishedVersions
      AutoPublishAlias: live
```

### Key Facts
- Works by snapshotting the initialized Firecracker microVM
- 3-tier cache: L1 (worker), L2 (placement group), S3 (region)
- Restores from snapshot instead of re-running INIT code
- Requires publishing a function version ($LATEST does not use SnapStart)
- Snapshot expires after 14 days of no invocations
- Python runtime hooks: `@register_before_snapshot`, `@register_after_restore`
- Java runtime hooks: CRaC API (`beforeCheckpoint`, `afterRestore`)

### What SnapStart Does NOT Do
- Does not combine with Provisioned Concurrency — you cannot enable both on the same function version (they solve the same problem differently)
- Does not eliminate cold starts entirely — it reduces INIT duration dramatically
- Network connections established during INIT may need to be re-established after restore
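
A common defense for the last point, using nothing beyond the standard library: validate a connection's age at invoke time and rebuild it when stale, so a restored snapshot never reuses a dead client. `_connect` is a hypothetical placeholder for a real client:

```python
import time

class RefreshingClient:
    """Rebuilds its underlying connection when idle too long — e.g. after a
    SnapStart restore, where a connection made during INIT may be stale."""

    def __init__(self, max_idle_seconds=300.0):
        self.max_idle = max_idle_seconds
        self._conn = None
        self._last_used = float("-inf")

    def _connect(self):
        return object()  # hypothetical placeholder for a real client/socket

    def get(self):
        now = time.monotonic()
        if self._conn is None or now - self._last_used > self.max_idle:
            self._conn = self._connect()  # stale or never created: rebuild
        self._last_used = now
        return self._conn
```

This works whether or not runtime hooks are configured, at the cost of an age check per invoke.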

## VPC Cold Starts — Solved Since 2019

VPC-connected Lambda functions used to add 10-15 seconds of cold start latency due to ENI creation. AWS fixed this with Hyperplane ENI in 2019:
- ENIs are now pre-provisioned in a shared pool
- Additional VPC penalty: ~50-200ms (indistinguishable from non-VPC for most workloads)
- If someone cites VPC cold start issues, they are working from pre-2019 information

## INIT Billing Analysis (August 2025)

Starting August 1, 2025, the INIT phase is billed for functions deployed as ZIP archives on managed runtimes (container-image and custom-runtime functions were already billed for INIT). Use this CloudWatch Logs Insights query to calculate your impact:

```
filter @type = "REPORT"
| stats
    sum((@memorySize/1000000/1024) * (@billedDuration/1000)) as BilledGBs,
    sum((@memorySize/1000000/1024) * ((@duration + @initDuration - @billedDuration)/1000)) as UnbilledInitGBs,
    UnbilledInitGBs / (UnbilledInitGBs + BilledGBs) as UnbilledInitRatio
```

### Interpreting Results
- `UnbilledInitRatio` < 0.05 (5%): Minimal cost impact (~3% increase typical)
- `UnbilledInitRatio` > 0.20 (20%): Indicates architectural issues — too many cold starts
- AWS production analysis: cold starts occur in <1% of invocations for typical workloads
- The "22x cost increase" claim requires 100% cold start rate — not a realistic scenario
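
The link between `UnbilledInitRatio` and your bill follows directly from the query's definitions: if r = U / (U + B), the increase over the previously billed B once INIT is billable is U / B = r / (1 − r). A quick check of that arithmetic:

```python
def cost_increase_pct(unbilled_init_ratio):
    """Bill increase (%) once INIT becomes billable.

    Given the query's UnbilledInitRatio r = U / (U + B), the extra GB-seconds
    U expressed as a share of the old bill B is U / B = r / (1 - r).
    """
    r = unbilled_init_ratio
    return 100 * r / (1 - r)

# A 3% ratio works out to a ~3.1% increase, consistent with the guidance
# above; a 20% ratio would mean a 25% increase — fix the cold starts instead.
```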

## Lambda Managed Instances (LMI)

Announced at re:Invent 2025. Runs Lambda functions on EC2 instances in your account.

### Key Capabilities
- Multi-concurrency: multiple invocations share one execution environment
- Cold starts eliminated for functions attached to a capacity provider
- EC2 pricing + 15% management fee (Savings Plans and RIs apply)
- Graviton4 support
- Scales within tens of seconds, absorbs 50% traffic spikes without new instances

### When to Use LMI
- I/O-bound workloads (waiting on APIs, databases, queues)
- Steady, predictable traffic patterns
- When EC2 pricing + Savings Plans saves money vs. Lambda duration pricing
- When you need >10GB memory or specific instance types
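
The break-even test in the third bullet is simple arithmetic. A sketch, assuming the published x86 Lambda GB-second price in us-east-1 (~$0.0000166667 — verify current and arm64 pricing) and a 730-hour month; the instance price passed in below is an illustrative placeholder, not a quote:

```python
LAMBDA_PRICE_PER_GB_S = 0.0000166667  # assumed us-east-1 x86 price; verify before relying on it
HOURS_PER_MONTH = 730

def monthly_lambda_cost(avg_concurrency, memory_gb):
    """Duration cost for a steady workload (request fees and free tier omitted)."""
    return avg_concurrency * memory_gb * HOURS_PER_MONTH * 3600 * LAMBDA_PRICE_PER_GB_S

def monthly_lmi_cost(instance_hourly_price, instance_count, mgmt_fee=0.15):
    """EC2 price plus the 15% management fee; Savings Plans/RIs not applied."""
    return instance_hourly_price * instance_count * HOURS_PER_MONTH * (1 + mgmt_fee)

# At a steady 50 concurrent 1 GB executions, Lambda duration cost is ~$2,190/mo;
# if four instances at a hypothetical $0.10/hr can carry that load (plausible
# for I/O-bound multi-concurrency), LMI comes to ~$336/mo.
```

The comparison only favors LMI when traffic is steady enough to keep the instances busy — which is exactly the "When NOT to Use" list below.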

### When NOT to Use LMI
- Bursty, unpredictable traffic (standard Lambda scales faster from zero)
- Low-traffic functions (EC2 minimum instances cost more than pay-per-use)
- When you need instant zero-to-hero scaling

## Decision Guide: When Do Cold Starts Matter?

| Scenario | Cold Starts Matter? | Solution |
|----------|-------------------|----------|
| API with typical traffic | No | Accept occasional 200-400ms cold start |
| High-traffic steady-state | No | LMI eliminates cold starts + saves money |
| Sub-10ms P99 SLA | Yes | Provisioned Concurrency or LMI |
| Zero-to-hero burst traffic | Yes | Provisioned Concurrency |
| Java without SnapStart | Yes | Enable SnapStart (config change) |
| Most other workloads | No | Cold starts are 200-400ms, <1% of invocations |
