# Cold Starts

> Init Duration vs warm path. Mitigations: Provisioned Concurrency, arm64, lazy imports, smaller packages, SnapStart.

## What triggers a cold start

A cold start happens whenever Lambda must create a new execution environment: the very first request after a deployment, a traffic spike beyond the number of warm environments, or a request arriving after an environment has sat idle long enough to be recycled (typically 5–15 minutes; AWS does not document the exact window). Deployments always cold-start the incoming version — you can't avoid the first one, only reduce how long it takes.
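
Because module-level code runs exactly once per environment, a module-level flag makes cold starts observable from inside the function itself. A minimal sketch; the flag name and response shape are illustrative:

```python
# Module-level code runs once, during the init phase of a new environment,
# so this flag is True only for the first invocation each environment serves.
_cold = True

def handler(event, context):
    global _cold
    was_cold, _cold = _cold, False
    # e.g. emit was_cold as a metric to measure your real-world cold-start rate
    return {"cold_start": was_cold}
```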

## The cold path

AWS provisions a Firecracker microVM, downloads and unpacks your code (or pulls the container image), starts the language runtime, then runs your module-level code. Only after all of that does your handler function get called. The timeline is roughly:

1. **Environment provisioning** — microVM boot, network attachment, filesystem mount. Not billed; AWS absorbs this.
2. **Init phase** — your module-level code: imports, client construction, config reads. Billed at full configured memory. Capped at 10 s.
3. **Handler phase** — `handler(event, context)` runs. Billed per-ms.
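
In code, the split between steps 2 and 3 looks like this. A hedged sketch: the `APP_CONFIG` variable and its shape are made up for illustration:

```python
import json
import os

# Init phase (step 2): imports, config reads, client construction.
# Runs once per environment, before the first event is handled.
CONFIG = json.loads(os.environ.get("APP_CONFIG", "{}"))  # hypothetical env var
TABLE = CONFIG.get("table", "orders")

def handler(event, context):
    # Handler phase (step 3): per-request work only. Anything hoisted from
    # here to module level moves its cost from every invocation to the
    # once-per-environment init.
    return {"table": TABLE, "id": event.get("id")}
```

The design point: expensive setup belongs at module level so warm invocations reuse it, at the price of a longer init phase on cold starts.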

CloudWatch shows this split: the `REPORT` log line includes an `Init Duration` field only on cold invocations; warm invocations omit it.
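
That makes cold starts easy to measure from logs. A sketch that pulls `Init Duration` out of a `REPORT` line; the sample values below are invented:

```python
import re

def init_duration_ms(report_line):
    """Return Init Duration in ms, or None if the invocation was warm."""
    m = re.search(r"Init Duration: ([\d.]+) ms", report_line)
    return float(m.group(1)) if m else None

cold = ("REPORT RequestId: 8f5c0a1e Duration: 12.34 ms Billed Duration: 13 ms "
        "Memory Size: 256 MB Max Memory Used: 64 MB Init Duration: 187.56 ms")
warm = "REPORT RequestId: 8f5c0a1e Duration: 5.10 ms Billed Duration: 6 ms"

print(init_duration_ms(cold))  # 187.56
print(init_duration_ms(warm))  # None
```

Run the same extraction over a CloudWatch Logs Insights export to get cold-start rate and p99 `Init Duration` per function.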

## Typical numbers

| Runtime | Typical cold start (p50) | Typical cold start (p99) |
|---------|--------------------------|--------------------------|
| Python 3.13 (zip, minimal deps) | ~150 ms | ~400 ms |
| Python 3.13 (zip, aioboto3 + aiofiles) | ~300 ms | ~700 ms |
| Node.js 22 | ~100 ms | ~300 ms |
| Java 21 (without SnapStart) | ~1–2 s | ~3–5 s |
| Java 21 (SnapStart enabled) | ~200 ms | ~600 ms |
| Container image (any runtime) | +100–300 ms | first pull can be 1–3 s |

## Mitigations

**Provisioned Concurrency (PC)** — pre-warms N environments so they're always in the "warm" state. Eliminates cold starts for the provisioned slots. You pay for those slots 24/7 even when idle. Use for latency-sensitive, predictable-traffic paths. Schedule PC changes via Application Auto Scaling for cost efficiency.
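
A sketch of scheduling PC with boto3's Application Auto Scaling client. The function name, alias, capacities, and cron windows are all placeholders; note that PC must target an alias or published version, never `$LATEST`:

```python
# All names and values below are placeholders for illustration.
RESOURCE_ID = "function:checkout-api:live"  # function:<name>:<alias>

scalable_target = dict(
    ServiceNamespace="lambda",
    ResourceId=RESOURCE_ID,
    ScalableDimension="lambda:function:ProvisionedConcurrency",
    MinCapacity=0,
    MaxCapacity=20,
)

# Warm 10 environments before business hours; a matching scale-down
# action (not shown) would drop back to 0 overnight.
scale_up = dict(
    ServiceNamespace="lambda",
    ScheduledActionName="pc-business-hours",
    ResourceId=RESOURCE_ID,
    ScalableDimension="lambda:function:ProvisionedConcurrency",
    Schedule="cron(0 8 ? * MON-FRI *)",  # 08:00 UTC, weekdays
    ScalableTargetAction={"MinCapacity": 10, "MaxCapacity": 20},
)

def apply(client):
    # client = boto3.client("application-autoscaling")
    client.register_scalable_target(**scalable_target)
    client.put_scheduled_action(**scale_up)
```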

**arm64** — Graviton2 executes CPU-bound init work ~10% faster than x86_64. Combined with the ~20% price reduction, arm64 is the default choice unless a native-wheel dependency blocks you.

**Smaller packages** — Lambda downloads and unpacks your zip on every cold start. Trimming unused transitive dependencies (audit the tree with `pipdeptree`, or install with `pip install --no-deps` and add back only what you need) and stripping test/doc files shaves real time. Every MB of extracted code costs a few ms.
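
A sketch of a pre-zip trimming pass. The prune lists here are assumptions; verify nothing they match is imported by your code at runtime:

```python
import pathlib
import shutil

PRUNE_DIRS = {"__pycache__", "tests", "test"}      # not needed at runtime
PRUNE_SUFFIXES = {".pyc", ".pyo", ".md", ".rst"}   # bytecode and docs

def trim(build_dir):
    """Delete prunable files and directories under build_dir before zipping."""
    root = pathlib.Path(build_dir)
    # Reverse sort visits children before parents, so entries inside a
    # pruned directory are handled before the directory itself is removed.
    for p in sorted(root.rglob("*"), reverse=True):
        if p.is_dir() and p.name in PRUNE_DIRS:
            shutil.rmtree(p)
        elif p.is_file() and p.suffix in PRUNE_SUFFIXES:
            p.unlink()
```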

**Lazy imports** — move rarely-used or slow imports inside the handler (or into a lazy-init guard). The most common win is heavy ML libraries only needed for inference: import them on first call, cache the result in a module-level variable.

**SnapStart** — takes a snapshot of the initialised execution environment after your init phase, then restores from that snapshot on cold starts, collapsing 1–5 s of JVM startup to ~200 ms. Launched for Java; since late 2024 also available for Python 3.12+ and .NET 8, but not for Node.js.

> **When cold starts don't matter:** batch jobs, async event pipelines, scheduled tasks — nobody is waiting on the p99. Only optimise cold starts when a human is waiting synchronously for the response.