# Cold Starts
> Init Duration vs warm path. Mitigations: Provisioned Concurrency, arm64, lazy imports, smaller packages, SnapStart.
![Cold vs warm timeline](lambda-cold_warm_timeline.svg)
## What triggers a cold start
A cold start happens whenever Lambda must create a new execution environment: the very first request after a deployment, when traffic spikes beyond the number of warm environments, and after an environment has sat idle long enough to be recycled (typically 5–15 minutes; AWS does not document the exact window). Deployments always cold-start the incoming version — you can't avoid the first one, only reduce how long it takes.
## The cold path
AWS provisions a Firecracker microVM, downloads and unpacks your code (or pulls the container image), starts the language runtime, then runs your module-level code. Only after all of that does your handler function get called. The timeline is roughly:
1. **Environment provisioning** — microVM boot, network attachment, filesystem mount. Not billed; AWS absorbs this.
2. **Init phase** — your module-level code: imports, client construction, config reads. Billed at full configured memory. Capped at 10 s.
3. **Handler phase** — `handler(event, context)` runs. Billed per-ms.
CloudWatch shows this split: the `REPORT` line includes `Init Duration` only on cold invocations. Warm invocations have no `Init Duration` line.
## Typical numbers
| Runtime | Typical cold start (p50) | Typical cold start (p99) |
|---------|--------------------------|--------------------------|
| Python 3.13 (zip, minimal deps) | ~150 ms | ~400 ms |
| Python 3.13 (zip, aioboto3 + aiofiles) | ~300 ms | ~700 ms |
| Node.js 22 | ~100 ms | ~300 ms |
| Java 21 (without SnapStart) | ~1–2 s | ~3–5 s |
| Java 21 (SnapStart enabled) | ~200 ms | ~600 ms |
| Container image (any runtime) | +100–300 ms | first pull can be 1–3 s |
## Mitigations
**Provisioned Concurrency (PC)** — pre-warms N environments so they're always in the "warm" state. Eliminates cold starts for the provisioned slots. You pay for those slots 24/7 even when idle. Use for latency-sensitive, predictable-traffic paths. Schedule PC changes via Application Auto Scaling for cost efficiency.
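A sketch of the scheduled-PC idea with boto3 and Application Auto Scaling. The function name, alias, schedules, and capacities are hypothetical; the actual AWS call is left commented out so the request shape can be inspected without credentials.

```python
def scheduled_action(name: str, schedule: str, capacity: int) -> dict:
    """Build a put_scheduled_action request for Lambda Provisioned Concurrency."""
    return {
        "ServiceNamespace": "lambda",
        "ScheduledActionName": name,
        # PC attaches to an alias or version, never $LATEST (alias is hypothetical)
        "ResourceId": "function:checkout-handler:live",
        "ScalableDimension": "lambda:function:ProvisionedConcurrency",
        "Schedule": schedule,  # cron expression, evaluated in UTC
        "ScalableTargetAction": {"MinCapacity": capacity, "MaxCapacity": capacity},
    }

# Warm 20 environments during business hours, drop to 0 overnight
scale_up = scheduled_action("business-hours", "cron(0 8 * * ? *)", 20)
scale_down = scheduled_action("overnight", "cron(0 20 * * ? *)", 0)
# boto3.client("application-autoscaling").put_scheduled_action(**scale_up)
```

Pairing a scale-up and scale-down action this way keeps you from paying for idle PC slots outside the traffic window.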
**arm64** — Graviton2 executes the init phase ~10% faster than x86_64 for CPU-bound init work. Combined with the ~20% price reduction, arm64 is the default choice unless native wheels block you.
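Opting in is a one-line change in most deployment tools. A SAM template fragment, assuming a hypothetical function resource:

```yaml
# AWS SAM fragment — switch the function to Graviton (arm64)
MyFunction:
  Type: AWS::Serverless::Function
  Properties:
    Runtime: python3.13
    Handler: app.handler
    Architectures:
      - arm64   # default is x86_64 if omitted
```

Redeploying with this change rebuilds the function on arm64; verify any native dependencies ship arm64 wheels first.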
**Smaller packages** — Lambda downloads and unpacks your zip on every cold start. Trimming unused transitive dependencies (audit the tree with `pipdeptree`, or install with `pip install --no-deps` and add back only what you need) and stripping test/doc files shaves real time. Every MB of extracted code costs a few ms.
**Lazy imports** — move rarely-used or slow imports inside the handler (or into a lazy-init guard). The most common win is heavy ML libraries only needed for inference: import them on first call, cache the result in a module-level variable.
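The pattern above can be sketched as follows. The `json` import stands in for a heavy ML library, and the handler and event shape are hypothetical:

```python
_model = None  # module-level cache, survives across warm invocations

def _get_model():
    """Load the heavy dependency on first use, off the init-phase critical path."""
    global _model
    if _model is None:
        import json  # stand-in for a slow import like numpy/torch
        _model = json.loads('{"weights": [1, 2, 3]}')
    return _model

def handler(event, context):
    if event.get("action") == "infer":
        return _get_model()["weights"]  # pays the import cost once per environment
    return "ok"  # fast path never imports the heavy library
```

Cold starts on the fast path stay cheap; only the first inference request in each environment pays the import.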
**SnapStart** — takes a snapshot of the initialised runtime state after your init phase, then restores from that snapshot on cold starts, collapsing a 1–5 s JVM startup to ~200 ms. Originally Java-only; AWS has since extended it to Python and .NET, but not Node.js.
> **When cold starts don't matter:** batch jobs, async event pipelines, scheduled tasks — nobody is waiting on the p99. Only optimise cold starts when a human is waiting synchronously for the response.