# Cold Starts
> Init Duration vs warm path. Mitigations: Provisioned Concurrency, arm64, lazy imports, smaller packages, SnapStart.
![Cold vs warm timeline](lambda-cold_warm_timeline.svg)
## What triggers a cold start
A cold start happens whenever Lambda must create a new execution environment: the very first request after a deployment, when traffic spikes beyond the number of warm environments, and after an environment has sat idle long enough to be recycled (typically 5–15 minutes; AWS does not document the exact window). Deployments always cold-start the incoming version — you can't avoid the first one, only reduce how long it takes.
## The cold path
AWS provisions a Firecracker microVM, downloads and unpacks your code (or pulls the container image), starts the language runtime, then runs your module-level code. Only after all of that does your handler function get called. The timeline is roughly:
1. **Environment provisioning** — microVM boot, network attachment, filesystem mount. Not billed; AWS absorbs this.
2. **Init phase** — your module-level code: imports, client construction, config reads. Billed at full configured memory. Capped at 10 s.
3. **Handler phase** — `handler(event, context)` runs. Billed per-ms.
CloudWatch shows this split: the `REPORT` line includes `Init Duration` only on cold invocations. Warm invocations have no `Init Duration` line.
## Typical numbers
| Runtime | Typical cold start (p50) | Typical cold start (p99) |
|---------|--------------------------|--------------------------|
| Python 3.13 (zip, minimal deps) | ~150 ms | ~400 ms |
| Python 3.13 (zip, aioboto3 + aiofiles) | ~300 ms | ~700 ms |
| Node.js 22 | ~100 ms | ~300 ms |
| Java 21 (without SnapStart) | ~1–2 s | ~3–5 s |
| Java 21 (SnapStart enabled) | ~200 ms | ~600 ms |
| Container image (any runtime) | +100–300 ms | first pull can be 1–3 s |
## Mitigations
**Provisioned Concurrency (PC)** — pre-warms N environments so they're always in the "warm" state. Eliminates cold starts for the provisioned slots. You pay for those slots 24/7 even when idle. Use for latency-sensitive, predictable-traffic paths. Schedule PC changes via Application Auto Scaling for cost efficiency.
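A sketch of the scheduled-PC idea with boto3 and Application Auto Scaling. The function name, alias, schedules, and capacities are hypothetical; the actual AWS call is left commented out so the request shape can be inspected without credentials.

```python
def scheduled_action(name: str, schedule: str, capacity: int) -> dict:
    """Build a put_scheduled_action request for Lambda Provisioned Concurrency."""
    return {
        "ServiceNamespace": "lambda",
        "ScheduledActionName": name,
        # PC attaches to an alias or version, never $LATEST (alias is hypothetical)
        "ResourceId": "function:checkout-handler:live",
        "ScalableDimension": "lambda:function:ProvisionedConcurrency",
        "Schedule": schedule,  # cron expression, evaluated in UTC
        "ScalableTargetAction": {"MinCapacity": capacity, "MaxCapacity": capacity},
    }

# Warm 20 environments during business hours, drop to 0 overnight
scale_up = scheduled_action("business-hours", "cron(0 8 * * ? *)", 20)
scale_down = scheduled_action("overnight", "cron(0 20 * * ? *)", 0)
# boto3.client("application-autoscaling").put_scheduled_action(**scale_up)
```

Pairing a scale-up and scale-down action this way keeps you from paying for idle PC slots outside the traffic window.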
**arm64** — Graviton2 executes the init phase ~10% faster than x86_64 for CPU-bound init work. Combined with the ~20% price reduction, arm64 is the default choice unless native wheels block you.
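Opting in is a one-line change in most deployment tools. A SAM template fragment, assuming a hypothetical function resource:

```yaml
# AWS SAM fragment — switch the function to Graviton (arm64)
MyFunction:
  Type: AWS::Serverless::Function
  Properties:
    Runtime: python3.13
    Handler: app.handler
    Architectures:
      - arm64   # default is x86_64 if omitted
```

Redeploying with this change rebuilds the function on arm64; verify any native dependencies ship arm64 wheels first.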
**Smaller packages** — Lambda downloads and unpacks your zip on every cold start. Trimming unused transitive dependencies (audit the tree with `pipdeptree`, or install with `pip install --no-deps` and add back only what you need) and stripping test/doc files shaves real time. Every MB of extracted code costs a few ms.
**Lazy imports** — move rarely-used or slow imports inside the handler (or into a lazy-init guard). The most common win is heavy ML libraries only needed for inference: import them on first call, cache the result in a module-level variable.
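The pattern above can be sketched as follows. The `json` import stands in for a heavy ML library, and the handler and event shape are hypothetical:

```python
_model = None  # module-level cache, survives across warm invocations

def _get_model():
    """Load the heavy dependency on first use, off the init-phase critical path."""
    global _model
    if _model is None:
        import json  # stand-in for a slow import like numpy/torch
        _model = json.loads('{"weights": [1, 2, 3]}')
    return _model

def handler(event, context):
    if event.get("action") == "infer":
        return _get_model()["weights"]  # pays the import cost once per environment
    return "ok"  # fast path never imports the heavy library
```

Cold starts on the fast path stay cheap; only the first inference request in each environment pays the import.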
**SnapStart** — takes a snapshot of the initialised runtime state after your init phase, then restores from that snapshot on cold starts, collapsing a 1–5 s JVM startup to ~200 ms. Originally Java-only; AWS has since extended it to Python and .NET, but not Node.js.
> **When cold starts don't matter:** batch jobs, async event pipelines, scheduled tasks — nobody is waiting on the p99. Only optimise cold starts when a human is waiting synchronously for the response.