lambda_local_runner/docs/lambdas-md/lambda-02-mental-model.md
2026-05-11 20:13:11 -03:00
Mental Model

Lambda is a Linux process whose lifecycle is managed for you. Most of the surprise comes from forgetting that it's still a process.

What Lambda actually is

Each invocation runs inside an execution environment: a Firecracker microVM running the Lambda runtime (e.g. python3.13), with your code unpacked into /var/task and an ephemeral /tmp. AWS owns the VM; you own everything inside the process. The microVM is created on demand, kept warm for a while, then torn down when idle traffic stops feeding it. You don't pick a server, but there is a server, and it has memory, a clock, and a filesystem.
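You can see that "server" from inside the process. A minimal sketch — it reads the runtime-set environment variables `AWS_LAMBDA_FUNCTION_MEMORY_SIZE` and `LAMBDA_TASK_ROOT`; the `.get()` fallbacks are there only so the snippet also runs outside Lambda:

```python
import os
import shutil
import time

def describe_environment() -> dict:
    """Peek at the machine underneath an invocation: memory, clock, filesystem.
    The fallbacks let this run locally, where Lambda's env vars are unset."""
    return {
        "memory_mb": int(os.environ.get("AWS_LAMBDA_FUNCTION_MEMORY_SIZE", "128")),
        "task_root": os.environ.get("LAMBDA_TASK_ROOT", os.getcwd()),
        "tmp_free_bytes": shutil.disk_usage("/tmp" if os.path.isdir("/tmp") else ".").free,
        "clock": time.time(),
    }
```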

The two phases

Every cold start splits cleanly into two:

  • Init phase — your module-level code runs once: imports, client construction, anything outside the handler function. Capped at 10 s. Billed at full configured memory. The os.environ reads at the top of lambda_function.py happen here.
  • Handler phase — handler(event, context) runs once per invocation. Billed per-millisecond at configured memory. Subsequent invocations on the same environment skip the init phase and go straight here.

This split is the single most useful thing to internalise. Heavy work at module level → pay it once per cold start. Heavy work inside the handler → pay it every invocation.
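The split is easy to observe directly. A minimal sketch using a module-level flag (a common cold-start detection pattern, not anything Lambda provides):

```python
import time

# Module scope: executed once per cold start, during the init phase.
_INIT_STARTED = time.monotonic()
_COLD = True

def handler(event, context):
    """First call on a fresh environment reports cold; later warm calls on
    the same environment skip init entirely and report warm."""
    global _COLD
    cold, _COLD = _COLD, False
    return {"cold_start": cold, "env_age_s": round(time.monotonic() - _INIT_STARTED, 3)}
```

Run it twice in the same process and only the first call reports a cold start — exactly what a warm environment does.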

Globals persist across warm invocations

Anything assigned at module scope survives between handler calls on the same environment. That includes the boto3 client (good — connection reuse, TCP keep-alive, no re-handshake) and any in-memory cache you build (good — but be careful, see Pitfalls). It also includes mutations you didn't mean to keep, like a list you appended to without thinking. The same warm container can serve thousands of invocations in a row, then disappear.

# module level — runs once per cold start, reused across warm invocations
import asyncio
import os

import boto3

BUCKET   = os.environ["BUCKET_NAME"]
ENDPOINT = os.environ.get("S3_ENDPOINT_URL")
s3 = boto3.client("s3", endpoint_url=ENDPOINT)  # reused: TCP keep-alive, no re-handshake

# handler level — runs every invocation
def handler(event, context):
    return asyncio.run(_run())  # _run: async entry point defined elsewhere in this module
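The flip side of persistence is the accidental kind. A sketch of the classic bug — a module-level list that quietly accumulates across warm invocations:

```python
# Module scope survives warm invocations — including mutations you didn't intend.
seen = []  # BUG: grows on every invocation served by this environment

def handler(event, context):
    seen.append(event.get("id"))  # leaks state between unrelated invocations
    return {"seen_so_far": len(seen)}
```

On a warm environment that serves thousands of invocations, `seen` keeps growing; the fix is to keep module-level state read-only, or reset it at the top of the handler.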

/tmp is real but local

Each environment has its own /tmp (default 512 MB, configurable to 10 GB). It persists across warm invocations on that environment, so you can stash artefacts you'd rather not rebuild — but it is not shared between concurrent executions, and it's gone when the environment dies. lambda_function.py writes /tmp/<uuid>.jsonl per invocation and uploads it to S3 at the end; the file then becomes garbage, and the next invocation starts fresh.
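The staging pattern looks roughly like this. A sketch, not the actual lambda_function.py: it uses tempfile.gettempdir() (which resolves to /tmp inside Lambda) so it also runs locally, and the S3 upload is stubbed out as a hypothetical call:

```python
import json
import os
import tempfile
import uuid

def handler(event, context):
    """Stage records in a per-invocation scratch file, then hand it off.
    tempfile.gettempdir() is /tmp inside Lambda; locally it's the OS temp dir."""
    path = os.path.join(tempfile.gettempdir(), f"{uuid.uuid4()}.jsonl")
    with open(path, "w") as f:
        for record in event.get("records", []):
            f.write(json.dumps(record) + "\n")
    # upload_to_s3(path)  # hypothetical uploader; after this the file is garbage
    size = os.path.getsize(path)
    os.remove(path)  # optional hygiene — the next invocation gets a fresh uuid anyway
    return {"bytes_staged": size}
```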

Concurrency is horizontal

If two events arrive while one is being processed, AWS spins up a second execution environment. Each environment processes one invocation at a time, single-threaded relative to your handler. The "concurrency" you see in CloudWatch is the count of environments running in parallel. There is no thread pool to tune. There is no shared memory between environments. If you need shared state, externalise it (DynamoDB, Redis, S3).
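The scaling behaviour can be captured in a toy model — my own simplification, not AWS code: each "environment" holds one event at a time and has private state, and a burst forces new environments into existence:

```python
# Toy model of horizontal scaling. Each Environment is single-threaded relative
# to its handler and its state is private — never shared with siblings.
class Environment:
    def __init__(self, env_id: int):
        self.env_id = env_id
        self.busy = False
        self.invocations = 0  # warm-reuse counter, local to this environment

    def invoke(self, event):
        self.invocations += 1
        return {"env": self.env_id, "nth_on_this_env": self.invocations}

def dispatch(events, fleet):
    """Route simultaneous events: reuse a free environment, or spin up a new
    one when all are busy. len(fleet) is the 'concurrency' CloudWatch shows."""
    results = []
    for event in events:  # pretend these all arrive at once
        env = next((e for e in fleet if not e.busy), None)
        if env is None:
            env = Environment(len(fleet))
            fleet.append(env)
        env.busy = True
        results.append(env.invoke(event))
    for e in fleet:
        e.busy = False  # batch done; environments stay warm for the next burst
    return results
```

A burst of three events creates three environments; a later burst of two reuses two of them warm — which is exactly why shared state has to live outside the fleet.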

The reuse window

Idle environments stick around for roughly 5–15 minutes (AWS doesn't promise a number) before being recycled. That's why a function that sees one request a minute almost never cold-starts, and a function that sees one a day always does. Cold Starts covers what that costs and how to mitigate it.

Lifecycle

Init is paid once, handler is paid every time. Freeze/thaw is free. Shutdown happens when nobody's looking.

[Diagram: Lambda execution environment lifecycle]