Labs

Hands-on walkthroughs that build on the existing app. Each lab mutates what you already have; there are no throw-away exercises.

Lab 0 — Local sandbox (start here)

Goal: run the full stack locally against MinIO with real PDFs.

  1. make install — creates .venv and installs deps
  2. make up — starts MinIO on :9000 (API) and :9001 (console)
  3. SOURCE_DIR=~/path/to/pdfs make seed — uploads PDFs to MinIO bucket
  4. make invoke — runs invoke.py which calls handler() with a minimal event
  5. Open http://localhost:9001 (minioadmin/minioadmin) and find the generated manifest in the manifests/ prefix

What you can break: set PREFIX to a non-existent prefix and observe that the handler returns count=0. Set QUEUE_MAX=1 and watch backpressure slow the producer. Remove S3_ENDPOINT_URL and watch the client fail to connect to MinIO.
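The QUEUE_MAX behavior in particular is easy to reproduce in isolation. Below is a minimal sketch of bounded-queue backpressure, assuming the handler uses a producer/consumer pattern internally; the queue size, thread layout, and key names are illustrative, not the app's actual internals:

```python
import queue
import threading
import time

q = queue.Queue(maxsize=1)  # QUEUE_MAX=1: producer blocks until the consumer takes an item
produced = []

def producer():
    for key in ["a.pdf", "b.pdf", "c.pdf"]:
        q.put(key)            # blocks while the queue is full -> backpressure
        produced.append(key)

def consumer():
    while True:
        key = q.get()
        if key is None:       # sentinel: stop consuming
            break
        time.sleep(0.05)      # simulate a slow download
        q.task_done()

t = threading.Thread(target=producer)
c = threading.Thread(target=consumer)
t.start(); c.start()
t.join()
q.put(None)                   # tell the consumer to exit
c.join()
print("produced in order:", produced)
```

With maxsize=1 the producer's second put() blocks until the consumer drains the first item, which is exactly the backpressure you observe with QUEUE_MAX=1.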

Lab 1 — Deploy to real AWS

Goal: package and deploy the function to AWS Lambda, invoke it against a real S3 bucket.

  1. Create an S3 bucket and upload sample PDFs to 2026/04/ prefix
  2. Create an IAM execution role with s3:GetObject, s3:PutObject, s3:ListBucket, and logs:*
  3. Build the deployment zip inside the Lambda image so compiled wheels match the runtime (the image's default entrypoint is the runtime bootstrap, so override it): docker run --rm --entrypoint pip -v $PWD:/var/task public.ecr.aws/lambda/python:3.13 install -r requirements.txt -t package/, then zip the contents of package/ together with lambda_function.py
  4. Create the function: aws lambda create-function --handler lambda_function.handler …
  5. Invoke: aws lambda invoke --function-name pdf-scanner --payload '{}' out.json
  6. Verify the manifest appeared in S3 and the presigned URL works

What you can break: remove s3:ListBucket from the role and invoke; the list call fails with AccessDenied. Note that s3:ListBucket must be granted on the bucket ARN, not the object ARN. Watch CloudTrail to see the denied call.
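The bucket-ARN versus object-ARN distinction in step 2 is the usual tripwire, so here is a sketch of the execution-role policy it implies. The bucket name is a placeholder, and the logs actions are scoped to the usual three instead of the broader logs:*:

```python
import json

BUCKET = "my-pdf-bucket"  # placeholder bucket name

policy = {
    "Version": "2012-10-17",
    "Statement": [
        {   # s3:ListBucket applies to the bucket itself
            "Effect": "Allow",
            "Action": ["s3:ListBucket"],
            "Resource": [f"arn:aws:s3:::{BUCKET}"],
        },
        {   # Get/Put apply to the objects inside the bucket
            "Effect": "Allow",
            "Action": ["s3:GetObject", "s3:PutObject"],
            "Resource": [f"arn:aws:s3:::{BUCKET}/*"],
        },
        {   # CloudWatch Logs for the function's log group
            "Effect": "Allow",
            "Action": ["logs:CreateLogGroup", "logs:CreateLogStream", "logs:PutLogEvents"],
            "Resource": ["*"],
        },
    ],
}
print(json.dumps(policy, indent=2))
```

Granting s3:ListBucket only on arn:aws:s3:::bucket/* looks plausible but never matches the list call, which is the AccessDenied this lab asks you to reproduce.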

Lab 2 — Add an S3 trigger

Goal: make the function fire automatically when a PDF is uploaded.

  1. Add a resource policy entry granting S3 lambda:InvokeFunction
  2. Configure an S3 event notification on the bucket for s3:ObjectCreated:* with a suffix filter of .pdf (notification filters use prefix/suffix rules, not globs)
  3. Upload a PDF and check CloudWatch Logs for the invocation
  4. Notice the event structure differs from the manual invoke — update the handler to extract the key from event["Records"][0]["s3"]["object"]["key"]

What you can break: upload a non-PDF to the same prefix and verify the filter prevents invocation. Remove the resource policy and verify the trigger silently stops firing (no error to the uploader — this is the async invocation model).
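Step 4's key extraction has a subtle catch: S3 URL-encodes object keys in event notifications, so a space in a filename arrives as +. A sketch of the extraction, using a fabricated sample payload shaped like an s3:ObjectCreated:* record:

```python
from urllib.parse import unquote_plus

def key_from_event(event):
    """Extract the object key from an S3 event notification record."""
    raw = event["Records"][0]["s3"]["object"]["key"]
    return unquote_plus(raw)  # undo the URL-encoding S3 applies to keys

# fabricated sample: a key containing a space, as S3 would deliver it
sample = {"Records": [{"s3": {"object": {"key": "2026/04/q2+report.pdf"}}}]}
print(key_from_event(sample))  # -> 2026/04/q2 report.pdf
```

Skipping unquote_plus works until the first key with a space or special character, at which point the subsequent GetObject 404s.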

Lab 3 — Switch to arm64

Goal: migrate to Graviton2 (arm64) and verify the roughly 20% lower duration pricing.

  1. Rebuild the zip using the arm64 Lambda image: public.ecr.aws/lambda/python:3.13-arm64
  2. Update the code and architecture in one call (the architecture is a property of the code package, so it is set via update-function-code, not update-function-configuration): aws lambda update-function-code --function-name pdf-scanner --architectures arm64 --zip-file fileb://<arm64 zip>
  3. Invoke and compare REPORT duration and billed duration in CloudWatch

What you can break: try deploying the x86 zip against the arm64 architecture — the function will import-error on any C-extension wheels.
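The 20% figure can be sanity-checked from a REPORT line. A sketch using us-east-1 per-GB-second duration prices as I understand them at the time of writing (verify against current AWS pricing before relying on them):

```python
# per GB-second duration prices, us-east-1 (assumed; check current pricing)
X86_PER_GB_S = 0.0000166667
ARM_PER_GB_S = 0.0000133334

def invoke_cost(billed_ms, memory_mb, price_per_gb_s):
    """Duration cost of one invocation: GB-seconds times the per-GB-second rate."""
    gb_s = (memory_mb / 1024) * (billed_ms / 1000)
    return gb_s * price_per_gb_s

# example: 1200 ms billed at 512 MB
x86 = invoke_cost(1200, 512, X86_PER_GB_S)
arm = invoke_cost(1200, 512, ARM_PER_GB_S)
print(f"x86: ${x86:.10f}  arm64: ${arm:.10f}  saving: {1 - arm / x86:.1%}")
```

Note this compares the price per GB-second at equal billed duration; if the workload also runs faster on Graviton2, the real saving compounds beyond 20%.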

Lab 4 — Enable Provisioned Concurrency

Goal: eliminate cold starts on the production alias.

  1. Publish version 1: aws lambda publish-version --function-name pdf-scanner
  2. Create the prod alias: aws lambda create-alias --function-name pdf-scanner --name prod --function-version 1
  3. Enable PC: aws lambda put-provisioned-concurrency-config --function-name pdf-scanner --qualifier prod --provisioned-concurrent-executions 2
  4. Invoke via the alias ARN and confirm Init Duration is absent from REPORT lines
  5. Check your bill later (billing data lags by several hours) and note the PC charges: Provisioned Concurrency is billed for as long as it stays configured, even with zero invocations, so delete the config when you finish the lab
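Step 4's check can be automated by scanning REPORT lines: a cold start adds an Init Duration field, and invokes served by Provisioned Concurrency omit it. The sample lines below are fabricated but follow the real REPORT format:

```python
def has_cold_start(report_line: str) -> bool:
    """A REPORT line contains 'Init Duration' only when the invoke paid for a cold start."""
    return "Init Duration" in report_line

# fabricated sample REPORT lines
cold = ("REPORT RequestId: 1111 Duration: 812.33 ms Billed Duration: 813 ms "
        "Memory Size: 512 MB Max Memory Used: 101 MB Init Duration: 421.77 ms")
warm = ("REPORT RequestId: 2222 Duration: 640.10 ms Billed Duration: 641 ms "
        "Memory Size: 512 MB Max Memory Used: 101 MB")

print(has_cold_start(cold), has_cold_start(warm))  # -> True False
```

Run this over the alias's log stream after enabling PC: every invoke routed to a provisioned environment should report False.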

Lab 5 — Add X-Ray tracing

Goal: see a trace with S3 subsegments in the X-Ray console.

  1. Add aws-xray-sdk to requirements.txt and rebuild the zip
  2. Add to lambda_function.py: from aws_xray_sdk.core import patch_all; patch_all()
  3. Enable active tracing on the function and add X-Ray permissions to the execution role
  4. Invoke and open X-Ray → Traces in the console. Verify the S3 list_objects_v2 call appears as a subsegment. (Do not expect one for generate_presigned_url: it signs locally without making an API call, so patching boto3 gives it nothing to record.)

Lab 6 — Fan out with Step Functions

Goal: process multiple S3 prefixes in parallel using a Map state.

  1. Update the handler to accept a prefix key in the event (instead of reading from env var)
  2. Create a Step Functions state machine with a Map state that iterates over a list of prefixes and invokes the Lambda for each
  3. Start an execution with input: {"prefixes": ["2026/01/", "2026/02/", "2026/03/"]}
  4. Observe parallel Lambda invocations in the execution graph and CloudWatch
  5. Add error handling: configure the Map state to catch Lambda errors and continue rather than fail the whole execution
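The Map state from steps 2 and 5 might look like the following in Amazon States Language, sketched here as a Python dict. The function ARN, account ID, and state names are placeholders, and the per-item Catch routes failures to a Pass state so one bad prefix does not fail the whole execution:

```python
import json

state_machine = {
    "StartAt": "ScanPrefixes",
    "States": {
        "ScanPrefixes": {
            "Type": "Map",
            "ItemsPath": "$.prefixes",   # iterate over the input list of prefixes
            "MaxConcurrency": 3,         # cap parallel Lambda invocations
            "Iterator": {
                "StartAt": "ScanOnePrefix",
                "States": {
                    "ScanOnePrefix": {
                        "Type": "Task",
                        "Resource": "arn:aws:lambda:us-east-1:123456789012:function:pdf-scanner",
                        "Parameters": {"prefix.$": "$"},  # pass each prefix as the event's prefix key
                        "Catch": [{
                            "ErrorEquals": ["States.ALL"],
                            "ResultPath": "$.error",
                            "Next": "PrefixFailed",       # record the error, keep the branch alive
                        }],
                        "End": True,
                    },
                    "PrefixFailed": {"Type": "Pass", "End": True},
                },
            },
            "End": True,
        }
    },
}
print(json.dumps(state_machine, indent=2))
```

Putting the Catch inside the iterator, rather than on the Map state itself, is what lets the other prefixes keep running when one fails; a Map-level Catch still marks the whole Map state as failed.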