# Cold Starts

Init Duration vs warm path. Mitigations: Provisioned Concurrency, arm64, lazy imports, smaller packages, SnapStart.

## Cold vs warm timeline

## What triggers a cold start

A cold start happens whenever Lambda must create a new execution environment: the very first request after a deployment, when traffic spikes beyond the number of warm environments, and after an environment has been idle long enough to be recycled (typically 5–15 minutes; AWS does not publish the exact window). Deployments always cold-start the incoming version — you can't avoid the first one, only reduce how long it takes.

## The cold path

AWS provisions a Firecracker microVM, downloads and unpacks your code (or pulls the container image), starts the language runtime, then runs your module-level code. Only after all of that does your handler function get called. The timeline is roughly:

  1. Environment provisioning — microVM boot, network attachment, filesystem mount. Not billed; AWS absorbs this.
  2. Init phase — your module-level code: imports, client construction, config reads. Billed at full configured memory. Capped at 10 s.
  3. Handler phase — `handler(event, context)` runs. Billed per-ms.
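The init/handler split above can be sketched in Python. The names here (`CONFIG`, `_INIT_STARTED`) are illustrative; a real function would typically build SDK clients at module scope, shown commented out so the sketch stays runnable locally:

```python
import time

# Module scope: everything here runs once, during the Init phase,
# and is billed at full configured memory on a cold start.
_INIT_STARTED = time.monotonic()

# Typical init-phase work — client construction, config reads.
# (Commented out to avoid an AWS dependency in this sketch.)
# import boto3
# s3 = boto3.client("s3")

CONFIG = {"table": "orders"}  # stand-in for a config read


def handler(event, context):
    # Handler scope: runs on every invocation, billed per-ms.
    # Anything cached at module scope is reused across warm invocations.
    warm_for = time.monotonic() - _INIT_STARTED
    return {"table": CONFIG["table"], "warm_for_s": round(warm_for, 3)}
```

On a warm invocation only the handler body runs; `_INIT_STARTED` and `CONFIG` survive from the init phase.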

CloudWatch shows this split: the `REPORT` log line includes an `Init Duration` field only on cold invocations; warm invocations omit it.
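That makes cold/warm classification easy to do from logs. A minimal sketch — the field names match the `REPORT` line format above, but `parse_report` is a hypothetical helper, not an AWS API:

```python
import re

def parse_report(line: str) -> dict:
    """Pull duration fields from a Lambda REPORT log line.

    Init Duration appears only on cold invocations, so its
    absence is what marks the invocation as warm.
    """
    fields = dict(
        re.findall(r"(Duration|Billed Duration|Init Duration): ([\d.]+) ms", line)
    )
    return {
        "duration_ms": float(fields["Duration"]),
        "init_ms": float(fields["Init Duration"]) if "Init Duration" in fields else None,
        "cold": "Init Duration" in fields,
    }
```

Feeding every `REPORT` line from a log group through this gives you a cold-start rate and an `Init Duration` distribution per function.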

## Typical numbers

| Runtime | Typical cold start (p50) | Typical cold start (p99) |
| --- | --- | --- |
| Python 3.13 (zip, minimal deps) | ~150 ms | ~400 ms |
| Python 3.13 (zip, aioboto3 + aiofiles) | ~300 ms | ~700 ms |
| Node.js 22 | ~100 ms | ~300 ms |
| Java 21 (without SnapStart) | ~1–2 s | ~3–5 s |
| Java 21 (SnapStart enabled) | ~200 ms | ~600 ms |
| Container image (any runtime) | +100–300 ms | first pull can be 1–3 s |

## Mitigations

Provisioned Concurrency (PC) — pre-warms N environments so they're always in the "warm" state. Eliminates cold starts for the provisioned slots. You pay for those slots 24/7 even when idle. Use for latency-sensitive, predictable-traffic paths. Schedule PC changes via Application Auto Scaling for cost efficiency.

arm64 — Graviton2 runs CPU-bound init work ~10% faster than x86_64. Combined with the ~20% lower per-ms price, arm64 is the default choice unless native wheels without arm64 builds block you.

Smaller packages — Lambda downloads and unpacks your zip on every cold start. Trimming unused transitive dependencies (audit the tree with `pipdeptree`, and install with `pip install --no-deps` where you pin everything explicitly) and stripping test/doc files shaves real time. Every MB of extracted code costs a few ms.
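Before trimming, it helps to know which imports are actually slow. A hedged sketch — `time_import` is a hypothetical helper, and the numbers vary by machine and by what is already cached:

```python
import importlib
import time

def time_import(module_name: str) -> float:
    """Return wall-clock seconds spent importing a module.

    Only meaningful the first time a module is imported in a process;
    repeat imports are near-free because Python caches them in
    sys.modules. Run this in a fresh interpreter per module to mimic
    the cold-start init phase.
    """
    start = time.perf_counter()
    importlib.import_module(module_name)
    return time.perf_counter() - start
```

Running it over your top-level dependencies in a fresh process quickly surfaces the one or two packages dominating init time.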

Lazy imports — move rarely-used or slow imports inside the handler (or into a lazy-init guard). The most common win is heavy ML libraries only needed for inference: import them on first call, cache the result in a module-level variable.

SnapStart — takes a snapshot of the initialised runtime state after your init phase, then restores from that snapshot on cold starts. Collapses 1–5 s JVM startup to ~200 ms. Launched for Java; AWS has since extended it to Python and .NET, but not Node.js.

When cold starts don't matter: batch jobs, async event pipelines, scheduled tasks — nobody is waiting on the p99. Only optimise cold starts when a human is waiting synchronously for the response.