# Concurrency
> Account quota, reserved, provisioned. The "100 RPS × 200 ms" math.
## The fundamental model
Lambda concurrency = the number of execution environments processing requests at the same instant. Each environment handles exactly one invocation at a time. There is no thread pool, no event loop shared across invocations — if two requests arrive simultaneously, AWS spins up two separate environments.
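
The one-invocation-per-environment rule means that concurrency at any instant equals the number of overlapping in-flight requests. A minimal sketch (the request timings below are hypothetical, just to illustrate the counting):

```python
def peak_environments(requests):
    """Given (start, end) times of requests, return the peak number of
    execution environments needed: one per simultaneously in-flight request."""
    events = []
    for start, end in requests:
        events.append((start, 1))   # environment acquired
        events.append((end, -1))    # environment released
    # Sort by time; at equal timestamps, process releases before acquisitions
    # so a freed environment can be reused immediately.
    events.sort(key=lambda e: (e[0], e[1]))
    peak = current = 0
    for _, delta in events:
        current += delta
        peak = max(peak, current)
    return peak

# Two requests arriving at t=0.0 overlap, so two environments are needed;
# a third request at t=0.3 reuses an environment freed at t=0.2.
print(peak_environments([(0.0, 0.2), (0.0, 0.2), (0.3, 0.5)]))  # 2
```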
The key formula: **concurrency ≈ RPS × average duration (in seconds)**. At 100 requests/s with a 200 ms average handler duration, you need 100 × 0.2 = **20 concurrent environments**. At 500 ms average, you need 50. At 2 s average, 200 — and so on. Latency optimisation directly reduces your concurrency footprint.
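
The formula reduces to a one-liner; the numbers below reproduce the worked examples (100 RPS at 200 ms, 500 ms, and 2 s):

```python
import math

def required_concurrency(rps: float, avg_duration_ms: float) -> int:
    """Concurrency ~= RPS x average duration in seconds, rounded up."""
    return math.ceil(rps * avg_duration_ms / 1000)

print(required_concurrency(100, 200))   # 20
print(required_concurrency(100, 500))   # 50
print(required_concurrency(100, 2000))  # 200
```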
## Account concurrency pool
Every AWS account has a regional concurrency quota — default **1 000 concurrent executions** per region, shared across all functions. When the pool is full, new invocations get throttled (sync → HTTP 429 TooManyRequestsException; async → queued and retried). Raising the limit requires a Service Quotas increase request; AWS typically grants up to 10 000 with a business justification.
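
The sync-path behaviour can be modelled as a counting semaphore that rejects rather than queues. This is a toy model, not an AWS API; `TooManyRequestsError` is a stand-in name for the real 429 exception:

```python
class TooManyRequestsError(Exception):
    """Stand-in for Lambda's HTTP 429 TooManyRequestsException."""

class RegionalPool:
    def __init__(self, quota: int = 1000):
        self.quota = quota      # default regional quota, shared by all functions
        self.in_flight = 0

    def invoke_sync(self):
        # Synchronous invokes are rejected outright when the pool is full
        # (async invokes would instead be queued and retried).
        if self.in_flight >= self.quota:
            raise TooManyRequestsError("429: regional concurrency exhausted")
        self.in_flight += 1

    def release(self):
        self.in_flight -= 1

pool = RegionalPool(quota=2)
pool.invoke_sync()
pool.invoke_sync()
try:
    pool.invoke_sync()
except TooManyRequestsError as e:
    print(e)  # the third simultaneous invoke is throttled
```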
This is the single most common production surprise: one function spikes and starves all others in the same region. Reserved concurrency is the fix.
## Types of concurrency
| Type | What it does | Cost | Use for |
|------|--------------|------|---------|
| **Unreserved** | Draws from the shared regional pool on demand | Invocation + duration only | Most functions |
| **Reserved** | Carves a slice of the regional pool exclusively for this function; acts as both a floor and a ceiling | No extra charge | Protecting critical paths from noisy neighbours; throttling cost runaway |
| **Provisioned** | Pre-initialises N environments and keeps them warm for as long as the setting is active | Provisioned GB-hours on top of invocation + duration | Latency-sensitive functions where cold starts are unacceptable |
## Reserved concurrency edge cases
- Setting reserved concurrency to **0** disables the function entirely — useful as a circuit breaker.
- Reserved concurrency counts against the account pool even when idle. If you set 500 reserved on a function, only 500 remain for all other functions (at default 1 000).
- Reserved concurrency does **not** pre-warm. You still cold-start; you just can't scale past the cap.
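
The edge cases above can be captured in a small toy model (again, not an AWS API): a reservation is subtracted from the shared pool up front, acts as a hard per-function ceiling, and a reservation of 0 throttles every invoke:

```python
class Throttled(Exception):
    pass

class Account:
    def __init__(self, quota: int = 1000):
        self.quota = quota
        self.reserved = {}    # function name -> reserved cap
        self.in_flight = {}   # function name -> current concurrency

    def set_reserved(self, fn: str, n: int):
        others = sum(v for k, v in self.reserved.items() if k != fn)
        if others + n > self.quota:
            raise ValueError("reservation exceeds account quota")
        self.reserved[fn] = n  # counts against the pool even when idle

    def invoke(self, fn: str):
        used = self.in_flight.get(fn, 0)
        if fn in self.reserved:
            if used >= self.reserved[fn]:   # ceiling; 0 means disabled
                raise Throttled(fn)
        else:
            # Unreserved functions share whatever the reservations left over.
            unreserved_pool = self.quota - sum(self.reserved.values())
            unreserved_used = sum(v for k, v in self.in_flight.items()
                                  if k not in self.reserved)
            if unreserved_used >= unreserved_pool:
                raise Throttled(fn)
        self.in_flight[fn] = used + 1

acct = Account(quota=1000)
acct.set_reserved("critical", 500)  # only 500 remain for everything else
acct.set_reserved("broken", 0)      # circuit breaker: every invoke throttles
```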
## Burst scaling
When traffic spikes from zero, Lambda spins up environments quickly, but not infinitely fast. Historically this was governed by an account-level burst limit (region-dependent, typically 500–3 000 environments immediately, then **500 new environments per minute**); since late 2023, each function instead scales by up to 1 000 additional concurrent executions every 10 seconds, independently of other functions. Either way, a sudden spike from 0 to 5 000 concurrent requests is not absorbed instantly, and every newly created environment pays a cold start. Provisioned Concurrency or pre-warming via a ping mechanism is the fix for sudden large spikes.
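
A back-of-envelope ramp-time estimate, assuming a simple model of an immediate burst allowance plus steady growth per minute. The parameter values are illustrative only (they vary by region and have changed over time), not guarantees:

```python
import math

def minutes_to_absorb(target: int, burst: int, growth_per_min: int) -> int:
    """Whole minutes until `target` concurrency is reachable from cold,
    given an immediate burst allowance and a steady growth rate."""
    if target <= burst:
        return 0
    return math.ceil((target - burst) / growth_per_min)

# e.g. a 0 -> 5 000 spike with a 3 000 immediate burst and +500/minute
print(minutes_to_absorb(5000, burst=3000, growth_per_min=500))  # 4
```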
> **Interview answer template:** "Concurrency = RPS × duration. Default pool is 1 000/region. Reserved carves a slice and prevents both starvation and runaway. Provisioned pre-warms to eliminate cold starts, but you pay for idle capacity."