Cold Starts
Init Duration vs warm path. Mitigations: Provisioned Concurrency, arm64, lazy imports, smaller packages, SnapStart.
What triggers a cold start
A cold start happens whenever Lambda must create a new execution environment: the very first request after a deployment, when traffic spikes beyond the number of warm environments, and after an environment has been idle long enough to be recycled (typically after 5–15 minutes of idleness; AWS does not publish the exact window). A deployment always cold-starts the new version: you can't avoid that first cold start, only shorten it.
The cold path
AWS provisions a Firecracker microVM, downloads and unpacks your code (or pulls the container image), starts the language runtime, then runs your module-level code. Only after all of that does your handler function get called. The timeline is roughly:
- Environment provisioning — microVM boot, network attachment, filesystem mount. Not billed; AWS absorbs this.
- Init phase — your module-level code: imports, client construction, config reads. Billed at full configured memory. Capped at 10 s.
- Handler phase — handler(event, context) runs. Billed per-ms.
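The init/handler split above can be sketched as a minimal Python handler. This is an illustrative stand-in: _build_client simulates real SDK client construction with a sleep, and the names are hypothetical.

```python
import time

# Module-level code runs once per execution environment: this is the
# Init phase, re-executed only on a cold start.
def _build_client():
    time.sleep(0.05)  # stand-in for real client/config construction
    return {"ready": True}

CLIENT = _build_client()  # paid once per cold start
INVOCATIONS = 0

def handler(event, context):
    # Handler phase: runs on every invocation, warm or cold,
    # reusing the module-level CLIENT built during init.
    global INVOCATIONS
    INVOCATIONS += 1
    return {"client_ready": CLIENT["ready"], "invocation": INVOCATIONS}
```

Anything hoisted to module level (clients, config, connection pools) is paid once per environment rather than once per request, which is exactly why a heavy init phase hurts only the cold path.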
CloudWatch shows this split: the REPORT line includes Init Duration only on cold invocations. Warm invocations have no Init Duration line.
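One way to spot cold invocations when processing exported logs is to check each REPORT line for the Init Duration field. A small sketch (the sample lines are illustrative, not real log output):

```python
import re

def init_duration_ms(report_line):
    # Returns the init time in ms, or None for a warm invocation
    # (warm REPORT lines carry no Init Duration field).
    m = re.search(r"Init Duration: ([\d.]+) ms", report_line)
    return float(m.group(1)) if m else None

cold = ("REPORT RequestId: 1-2-3 Duration: 12.3 ms Billed Duration: 13 ms "
        "Memory Size: 256 MB Max Memory Used: 64 MB Init Duration: 231.07 ms")
warm = ("REPORT RequestId: 4-5-6 Duration: 9.8 ms Billed Duration: 10 ms "
        "Memory Size: 256 MB Max Memory Used: 64 MB")

print(init_duration_ms(cold))  # 231.07
print(init_duration_ms(warm))  # None
```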
Typical numbers
| Runtime | Typical cold start (p50) | Typical cold start (p99) |
|---|---|---|
| Python 3.13 (zip, minimal deps) | ~150 ms | ~400 ms |
| Python 3.13 (zip, aioboto3 + aiofiles) | ~300 ms | ~700 ms |
| Node.js 22 | ~100 ms | ~300 ms |
| Java 21 (without SnapStart) | ~1–2 s | ~3–5 s |
| Java 21 (SnapStart enabled) | ~200 ms | ~600 ms |
| Container image (any runtime) | +100–300 ms | first pull can be 1–3 s |
Mitigations
Provisioned Concurrency (PC) — pre-warms N environments so they're always in the "warm" state. Eliminates cold starts for the provisioned slots. You pay for those slots 24/7 even when idle. Use for latency-sensitive, predictable-traffic paths. Schedule PC changes via Application Auto Scaling for cost efficiency.
arm64 — Graviton2 executes the init phase ~10% faster than x86_64 for CPU-bound init work. Combined with the ~20% price reduction, arm64 is the default choice unless native wheels block you.
Smaller packages — Lambda downloads and unpacks your zip on every cold start. Trimming unused transitive dependencies (audit the tree with pipdeptree, or install with pip install --no-deps and add back only what's actually imported) and stripping test/doc files shaves real time. Every MB of extracted code costs a few ms.
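Import cost is easy to measure directly before deciding what to trim. A minimal sketch, using stdlib modules as stand-ins for real dependencies:

```python
import importlib
import sys
import time

def timed_import(name):
    # Drop any cached copy so we measure a genuine first import,
    # the same cost a cold start pays.
    sys.modules.pop(name, None)
    t0 = time.perf_counter()
    importlib.import_module(name)
    return (time.perf_counter() - t0) * 1000.0

# Stdlib modules as stand-ins for your actual dependencies.
for mod in ("json", "email", "decimal"):
    print(f"{mod}: {timed_import(mod):.2f} ms")
```

Run this against your real dependency list; anything dominating the total is a candidate for trimming or lazy import.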
Lazy imports — move rarely-used or slow imports inside the handler (or into a lazy-init guard). The most common win is heavy ML libraries only needed for inference: import them on first call, cache the result in a module-level variable.
SnapStart — takes a snapshot of the initialised runtime state after your init phase, then restores from that snapshot on cold starts. Collapses 1–5 s JVM startup to ~200 ms. Originally Java-only; AWS has since extended it to Python 3.12+ and .NET 8, though unlike Java those runtimes incur additional caching and restore charges.
When cold starts don't matter: batch jobs, async event pipelines, scheduled tasks — nobody is waiting on the p99. Only optimise cold starts when a human is waiting synchronously for the response.