bootstrap SAM stack — S3 bucket + stub Lambda

2026-05-18 05:21:46 -03:00
parent 3629d1183b
commit 5bfabae7a5
20 changed files with 753 additions and 241 deletions

@@ -1,8 +1,8 @@
# sign_pdfs_optimized — Walkthrough
# sign_pdfs — Walkthrough
> Production-refined fork of `sign_pdfs`. Applies 8 improvements at once; meant to be compared side-by-side with the original in the local tester.
> The production handler. 8 refinements over the historical baseline (`functions/sign_pdfs_v1/`), all applied together so the tester can flip between the two and show the difference live.
The original `sign_pdfs` is the "before" snapshot — it is left untouched. This file (`sign_pdfs_optimized/handler.py`) is the "after": same contract, same output shape, all the rough edges from the original fixed.
`sign_pdfs_v1/` is the historical baseline — kept untouched in the repo so the local tester dropdown can compare "before" and "after" side-by-side. This file (`functions/sign_pdfs/handler.py`) is what gets deployed to AWS (function name `eth-demo-sign-pdfs`): same contract, same output shape, all the rough edges from v1 fixed.
---
@@ -56,7 +56,7 @@ def _emit_emf(metrics: dict, **dims):
"_aws": {
"Timestamp": int(time.time() * 1000),
"CloudWatchMetrics": [{
"Namespace": "eth/sign_pdfs_optimized",
"Namespace": "eth/sign_pdfs",
"Dimensions": [list(dims.keys())],
"Metrics": [
{"Name": k, "Unit": "Bytes" if k.endswith("Bytes") else "Count"}
@@ -195,7 +195,7 @@ _emit_emf({
"PDFsProcessed": count, "S3ListPages": pages,
"PresignCount": count, "ManifestBytes": manifest_bytes,
"ResponseBytes": response_bytes,
}, Function="sign_pdfs_optimized")
}, Function="sign_pdfs")
return result
```
@@ -238,11 +238,17 @@ def handler(event, context):
---
## Further improvements (not yet applied)
## Further improvements
These are the next natural steps if this function were going to production. They were left out intentionally — each adds infrastructure or AWS-side configuration that goes beyond the handler itself.
The next natural production steps. Status markers below reflect what is already declared in this repo's `template.yaml` vs what is left for later.
### Idempotency
| Status | Meaning |
|---|---|
| `[DONE]` | Already applied in `template.yaml` |
| `[PLANNED]` | Not yet applied but on the roadmap for this stack |
| `[NOT IN SCOPE]` | Doesn't apply to this invocation model |
### Idempotency — `[PLANNED]`
The function is not idempotent in the strict sense. Each invocation with the same `(bucket, prefix)` event produces a new manifest at a new UUID key. If Lambda retries the invocation (async invocations retry up to 2 times by default, and S3/SNS/EventBridge are at-least-once), you accumulate duplicate manifests in the `manifests/` prefix.
@@ -280,9 +286,9 @@ AWS PowerTools for Lambda has a built-in `@idempotent` decorator that implements
**What it requires:** a DynamoDB table, `dynamodb:GetItem` + `dynamodb:PutItem` IAM permissions on the execution role, and the PowerTools layer (or `aws-lambda-powertools` in requirements).
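
To make the duplicate-manifest problem concrete, here is a minimal sketch of a deterministic key derived from the `(bucket, prefix)` pair. It is not the PowerTools implementation and not code from this handler: the function names `idempotency_key` and `manifest_key` are hypothetical, and the event field names `bucket` and `prefix` are assumed from the description above.

```python
import hashlib
import json


def idempotency_key(event: dict) -> str:
    """Hash only the fields that define the job, in a stable order."""
    # Assumed event shape: {"bucket": ..., "prefix": ...}; other fields are ignored.
    payload = {"bucket": event["bucket"], "prefix": event["prefix"]}
    canonical = json.dumps(payload, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def manifest_key(event: dict) -> str:
    """Hypothetical deterministic alternative to a UUID-based manifest key."""
    return f"manifests/{idempotency_key(event)}.json"
```

With a key like this, a retried invocation would overwrite the same manifest object instead of adding a new one. It would not skip the repeated work by itself; that part is what a persisted dedup record provides.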
### Manifest lifecycle rule
### Manifest lifecycle rule `[DONE — template.yaml]`
Every invocation writes a new object under `manifests/`. Without cleanup, this prefix grows unbounded. The fix is an S3 lifecycle rule on the bucket — not a handler change:
Every invocation writes a new object under `manifests/`. Without cleanup, this prefix grows unbounded. The fix is an S3 lifecycle rule on the bucket — not a handler change. The rule is declared on the `ReportsBucket` resource in `template.yaml` (verify in the S3 console under Properties → Lifecycle rules; entry `expire-manifests`):
```json
{
@@ -303,7 +309,7 @@ Objects under `manifests/` are deleted by S3 automatically after 1 day. The pres
With both in place: a retry returns the same `manifest_url` pointing to the same (still-live) manifest object; after 24 hours the manifest is gone and the dedup record has expired, so the next invocation starts fresh. The combination is clean.
### ReportBatchItemFailures (SQS only)
### ReportBatchItemFailures (SQS only) `[NOT IN SCOPE]`
If this function were triggered by an SQS event source mapping (one message = one `(bucket, prefix)` job), the consumer-level `errors` field isn't enough — Lambda needs to know *which SQS messages* failed so it can re-queue only those. Return a `batchItemFailures` list instead of raising:
@@ -323,9 +329,9 @@ Without this, a single failed message causes the entire batch to retry, includin
**What it requires:** `ReportBatchItemFailures` enabled on the ESM configuration (CDK/Terraform) and restructuring the handler to iterate over `event["Records"]`. Not applicable to direct (RequestResponse) invocations like the local tester uses.
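
The restructuring mentioned above could look roughly like the following sketch. The `Records`, `messageId`, `body`, `batchItemFailures`, and `itemIdentifier` names follow the Lambda SQS event and partial batch response shapes; everything else is an assumption for illustration: the JSON message body carrying `bucket` and `prefix`, the `sqs_handler` name, and the `process_job` placeholder standing in for the real signing work.

```python
import json


def process_job(bucket: str, prefix: str) -> None:
    """Hypothetical placeholder for the real per-job work; raises on failure."""
    if not bucket:
        raise ValueError("bucket is required")


def sqs_handler(event: dict, context: object) -> dict:
    """Process each SQS record independently and report only the failed ones."""
    failures = []
    for record in event.get("Records", []):
        try:
            # Assumed body format: one JSON object per message.
            job = json.loads(record["body"])
            process_job(job["bucket"], job["prefix"])
        except Exception:
            # Do not re-raise: record the message ID so only this message is retried.
            failures.append({"itemIdentifier": record["messageId"]})
    return {"batchItemFailures": failures}
```

An empty `batchItemFailures` list signals that the whole batch succeeded, so the handler returns it even when nothing failed.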
### arm64 / Graviton2
### arm64 / Graviton2 `[DONE — template.yaml]`
No code change needed. Switch the function's architecture to `arm64` in the deployment config:
No code change needed. Switch the function's architecture to `arm64` in the deployment config. Declared once under `Globals.Function.Architectures` in `template.yaml` so it applies to every function the stack adds later:
```yaml
# SAM