# Concurrency
> Account quota, reserved, provisioned. The "100 RPS × 200 ms" math.
## The fundamental model
Lambda concurrency = the number of execution environments processing requests at the same instant. Each environment handles exactly one invocation at a time. There is no thread pool, no event loop shared across invocations — if two requests arrive simultaneously, AWS spins up two separate environments.
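
The one-invocation-per-environment rule means that concurrency at any instant equals the number of overlapping in-flight requests. A minimal sketch (the request timings below are hypothetical, just to illustrate the counting):

```python
def peak_environments(requests):
    """Given (start, end) times of requests, return the peak number of
    execution environments needed: one per simultaneously in-flight request."""
    events = []
    for start, end in requests:
        events.append((start, 1))   # environment acquired
        events.append((end, -1))    # environment released
    # Sort by time; at equal timestamps, process releases before acquisitions
    # so a freed environment can be reused immediately.
    events.sort(key=lambda e: (e[0], e[1]))
    peak = current = 0
    for _, delta in events:
        current += delta
        peak = max(peak, current)
    return peak

# Two requests arriving at t=0.0 overlap, so two environments are needed;
# a third request at t=0.3 reuses an environment freed at t=0.2.
print(peak_environments([(0.0, 0.2), (0.0, 0.2), (0.3, 0.5)]))  # 2
```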
The key formula: **concurrency ≈ RPS × average duration (in seconds)**. At 100 requests/s with a 200 ms average handler duration, you need 100 × 0.2 = **20 concurrent environments**. At 500 ms average, you need 50. At 2 s average, 200 — and so on. Latency optimisation directly reduces your concurrency footprint.
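
The formula reduces to a one-liner; the numbers below reproduce the worked examples (100 RPS at 200 ms, 500 ms, and 2 s):

```python
import math

def required_concurrency(rps: float, avg_duration_ms: float) -> int:
    """Concurrency ~= RPS x average duration in seconds, rounded up."""
    return math.ceil(rps * avg_duration_ms / 1000)

print(required_concurrency(100, 200))   # 20
print(required_concurrency(100, 500))   # 50
print(required_concurrency(100, 2000))  # 200
```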
## Account concurrency pool
Every AWS account has a regional concurrency quota — default **1 000 concurrent executions** per region, shared across all functions. When the pool is full, new invocations get throttled (sync → HTTP 429 TooManyRequestsException; async → queued and retried). Raising the limit requires a Service Quotas increase request; AWS typically grants up to 10 000 with a business justification.
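
The sync-path behaviour can be modelled as a counting semaphore that rejects rather than queues. This is a toy model, not an AWS API; `TooManyRequestsError` is a stand-in name for the real 429 exception:

```python
class TooManyRequestsError(Exception):
    """Stand-in for Lambda's HTTP 429 TooManyRequestsException."""

class RegionalPool:
    def __init__(self, quota: int = 1000):
        self.quota = quota      # default regional quota, shared by all functions
        self.in_flight = 0

    def invoke_sync(self):
        # Synchronous invokes are rejected outright when the pool is full
        # (async invokes would instead be queued and retried).
        if self.in_flight >= self.quota:
            raise TooManyRequestsError("429: regional concurrency exhausted")
        self.in_flight += 1

    def release(self):
        self.in_flight -= 1

pool = RegionalPool(quota=2)
pool.invoke_sync()
pool.invoke_sync()
try:
    pool.invoke_sync()
except TooManyRequestsError as e:
    print(e)  # the third simultaneous invoke is throttled
```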
This is the single most common production surprise: one function spikes and starves all others in the same region. Reserved concurrency is the fix.
## Types of concurrency
| Type | What it does | Cost | Use for |
|------|--------------|------|---------|
| **Unreserved** | Draws from the shared regional pool on demand | Invocation + duration only | Most functions |
| **Reserved** | Carves a slice of the regional pool exclusively for this function; acts as both a floor and a ceiling | No extra charge | Protecting critical paths from noisy neighbours; throttling cost runaway |
| **Provisioned** | Pre-initialises N environments and keeps them warm for as long as the setting is active | Provisioned GB-hours on top of invocation + duration | Latency-sensitive functions where cold starts are unacceptable |
## Reserved concurrency edge cases
- Setting reserved concurrency to **0** disables the function entirely — useful as a circuit breaker.
- Reserved concurrency counts against the account pool even when idle. If you set 500 reserved on a function, only 500 remain for all other functions (at default 1 000).
- Reserved concurrency does **not** pre-warm. You still cold-start; you just can't scale past the cap.
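
The edge cases above can be captured in a small toy model (again, not an AWS API): a reservation is subtracted from the shared pool up front, acts as a hard per-function ceiling, and a reservation of 0 throttles every invoke:

```python
class Throttled(Exception):
    pass

class Account:
    def __init__(self, quota: int = 1000):
        self.quota = quota
        self.reserved = {}    # function name -> reserved cap
        self.in_flight = {}   # function name -> current concurrency

    def set_reserved(self, fn: str, n: int):
        others = sum(v for k, v in self.reserved.items() if k != fn)
        if others + n > self.quota:
            raise ValueError("reservation exceeds account quota")
        self.reserved[fn] = n  # counts against the pool even when idle

    def invoke(self, fn: str):
        used = self.in_flight.get(fn, 0)
        if fn in self.reserved:
            if used >= self.reserved[fn]:   # ceiling; 0 means disabled
                raise Throttled(fn)
        else:
            # Unreserved functions share whatever the reservations left over.
            unreserved_pool = self.quota - sum(self.reserved.values())
            unreserved_used = sum(v for k, v in self.in_flight.items()
                                  if k not in self.reserved)
            if unreserved_used >= unreserved_pool:
                raise Throttled(fn)
        self.in_flight[fn] = used + 1

acct = Account(quota=1000)
acct.set_reserved("critical", 500)  # only 500 remain for everything else
acct.set_reserved("broken", 0)      # circuit breaker: every invoke throttles
```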
## Burst scaling
When traffic spikes from zero, Lambda spins up environments quickly, but not infinitely fast. Historically this was governed by an account-level burst limit (region-dependent, typically 500–3 000 environments immediately, then **500 new environments per minute**); since late 2023, each function instead scales by up to 1 000 additional concurrent executions every 10 seconds, independently of other functions. Either way, a sudden spike from 0 to 5 000 concurrent requests is not absorbed instantly, and every newly created environment pays a cold start. Provisioned Concurrency or pre-warming via a ping mechanism is the fix for sudden large spikes.
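
A back-of-envelope ramp-time estimate, assuming a simple model of an immediate burst allowance plus steady growth per minute. The parameter values are illustrative only (they vary by region and have changed over time), not guarantees:

```python
import math

def minutes_to_absorb(target: int, burst: int, growth_per_min: int) -> int:
    """Whole minutes until `target` concurrency is reachable from cold,
    given an immediate burst allowance and a steady growth rate."""
    if target <= burst:
        return 0
    return math.ceil((target - burst) / growth_per_min)

# e.g. a 0 -> 5 000 spike with a 3 000 immediate burst and +500/minute
print(minutes_to_absorb(5000, burst=3000, growth_per_min=500))  # 4
```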
> **Interview answer template:** "Concurrency = RPS × duration. Default pool is 1 000/region. Reserved carves a slice and prevents both starvation and runaway. Provisioned pre-warms to eliminate cold starts, but you pay for idle capacity."