update docs

2026-05-11 20:13:11 -03:00
commit 2ffabb672e
40 changed files with 5869 additions and 0 deletions
--- a/docs/lambdas-md/lambda-17-adjacent.md
+++ b/docs/lambdas-md/lambda-17-adjacent.md
@@ -0,0 +1,40 @@
+# Adjacent
+
+> Brief orientation on AWS Glue and Prometheus/Grafana — the secondary gaps from the interview.
+
+## AWS Glue
+
+Glue is a managed Spark-based ETL service. Lambda and Glue solve different problems:
+
+|  | Lambda | Glue |
+|--|--------|------|
+| **Runtime model** | Serverless functions; up to 15 min per invocation; one invocation at a time per execution environment | Managed Spark cluster; hours-long jobs; distributed compute |
+| **Data scale** | Up to a few GB comfortably | TB to PB natively |
+| **Language** | Python, Node.js, Java, .NET, Ruby; Go, Rust, and others via custom runtime | PySpark, Scala; Glue Studio for visual, low-code authoring |
+| **Startup time** | Milliseconds (warm) | 1–2 minutes to provision Spark cluster |
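+| **Cost model** | Per request + per GB-second of duration, metered in 1 ms increments | Per DPU-hour (about $0.44 per DPU-hour in us-east-1), billed per second; 1-minute minimum on Glue 2.0 and later, 10-minute minimum on Glue 0.9/1.0 |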
+| **Use for** | Light transforms, event reactions, API backends | Large-scale joins, aggregations, schema inference on data lake |
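+
+As a rough illustration of the Glue cost row, the sketch below estimates the price of a single job run. The function name `glue_job_cost` and its parameters are hypothetical, the $0.44 default is the per-DPU-hour figure from the table, and the minimum billed duration is passed in explicitly because it depends on the Glue version.
+
+```python
+import math
+
+
+def glue_job_cost(dpus: float, runtime_seconds: float, min_billed_seconds: int,
+                  usd_per_dpu_hour: float = 0.44) -> float:
+    """Estimate the cost in USD of one Glue job run (illustrative only)."""
+    # Round the runtime up to whole seconds, then apply the billing minimum.
+    billed_seconds = max(math.ceil(runtime_seconds), min_billed_seconds)
+    return dpus * (billed_seconds / 3600) * usd_per_dpu_hour
+
+
+# A 10-DPU job that runs for 5 minutes, with an assumed 60-second minimum.
+print(round(glue_job_cost(dpus=10, runtime_seconds=300, min_billed_seconds=60), 4))
+```
+
+Short jobs are where the billing minimum matters: a 20-second run is charged as if it ran for the full minimum, which is one reason small transforms tend to be cheaper on Lambda.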
+
+Key Glue concepts to know: **DynamicFrame** (Glue's DataFrame variant; each record is self-describing, so it tolerates inconsistent or evolving schemas), **Glue Data Catalog** (centralised metadata store for table schemas, also used by Athena), **Job Bookmarks** (Glue persists state about data it has already processed, such as S3 objects read by earlier runs, so incremental runs skip it).
+
+The decision is usually straightforward: if the data fits in Lambda's memory (configurable up to 10 GB) and the job finishes within the 15-minute timeout, use Lambda. If you're joining multiple large S3 datasets or transforming daily partition files that exceed those limits, use Glue.
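+
+The rule of thumb above can be written down as a small helper. This is only a sketch: `choose_engine` and its parameters are hypothetical names, and the 3 GB default for `memory_budget_gb` is an assumed stand-in for "a few GB comfortably", not an AWS limit.
+
+```python
+def choose_engine(peak_data_gb: float, est_runtime_min: float, *,
+                  memory_budget_gb: float = 3.0,
+                  lambda_timeout_min: float = 15.0) -> str:
+    """Return "lambda" or "glue" using the memory and timeout rule of thumb."""
+    fits_in_memory = peak_data_gb <= memory_budget_gb
+    finishes_in_time = est_runtime_min < lambda_timeout_min
+    return "lambda" if fits_in_memory and finishes_in_time else "glue"
+
+
+print(choose_engine(peak_data_gb=0.5, est_runtime_min=2))    # small, fast transform
+print(choose_engine(peak_data_gb=200, est_runtime_min=45))   # large multi-dataset join
+```
+
+Either condition failing is enough to push the job to Glue; in practice the estimates themselves are the hard part, so treat the thresholds as knobs rather than fixed values.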
+
+## Prometheus
+
+Prometheus is a pull-based time-series metrics system. It scrapes HTTP `/metrics` endpoints on a schedule. The fundamental tension with Lambda: Lambda functions are ephemeral — there's no persistent HTTP endpoint to scrape, and the function may be at zero concurrency between invocations.
+
+Options for Lambda → Prometheus:
+
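+- **EMF → CloudWatch → Grafana CloudWatch plugin**: no Prometheus involved. The function writes metrics as CloudWatch Embedded Metric Format (EMF) log lines, and Grafana reads the resulting metrics directly from CloudWatch. Easiest for AWS-native stacks.
+- **Remote write to Amazon Managed Service for Prometheus (AMP)**: the function pushes metrics to AMP through the Prometheus remote_write API at the end of each invocation. Grafana or Amazon Managed Grafana reads from AMP. The request body is a snappy-compressed protobuf `WriteRequest` and must be SigV4-signed; the Python `prometheus_client` library does not implement remote write on its own, so you need a remote-write client or an OpenTelemetry collector layer that exports to AMP.
+- **Pushgateway**: a persistent intermediary that Lambda pushes to and Prometheus scrapes. More infrastructure to manage, plus a stale-metric risk: the Pushgateway keeps serving the last pushed value until that metric group is overwritten or deleted.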
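+
+For the first option, a minimal sketch of emitting one custom metric as an EMF log line from a Python handler, using only the standard library. The namespace `OrdersService`, the metric `OrdersProcessed`, the `Stage` dimension, and the event shape are all made up for illustration; the `_aws` envelope follows the Embedded Metric Format layout.
+
+```python
+import json
+import time
+
+
+def emf_record(namespace, metric_name, value, unit="Count",
+               dimensions=None, timestamp_ms=None):
+    """Build one EMF log record carrying a single metric value."""
+    dimensions = dimensions or {}
+    record = {
+        "_aws": {
+            # Epoch milliseconds; CloudWatch uses this as the metric timestamp.
+            "Timestamp": timestamp_ms if timestamp_ms is not None
+            else int(time.time() * 1000),
+            "CloudWatchMetrics": [{
+                "Namespace": namespace,
+                # One dimension set, listing the dimension keys defined below.
+                "Dimensions": [list(dimensions.keys())],
+                "Metrics": [{"Name": metric_name, "Unit": unit}],
+            }],
+        },
+        # The metric value lives at the top level under the metric's name.
+        metric_name: value,
+    }
+    record.update(dimensions)  # dimension values are top-level members too
+    return record
+
+
+def handler(event, context):
+    # Hypothetical event shape: {"records": [...]}
+    processed = len(event.get("records", []))
+    # In Lambda, stdout is shipped to CloudWatch Logs, where EMF is extracted.
+    print(json.dumps(emf_record("OrdersService", "OrdersProcessed", processed,
+                                dimensions={"Stage": "prod"})))
+    return {"processed": processed}
+```
+
+Because the metric travels as a log line, there is no extra network call inside the invocation, which is much of why this route is the lowest-friction one for Lambda.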
+
+## Grafana
+
+Grafana is a dashboarding layer — it doesn't store data, it queries data sources. Relevant data sources for Lambda observability:
+
+- **CloudWatch**: built-in Grafana data source; queries CloudWatch Metrics and CloudWatch Logs Insights. Zero extra infrastructure. The standard choice for Lambda metrics (invocations, errors, duration, throttles, concurrent executions).
+- **Amazon Managed Service for Prometheus**: query via PromQL if you've pushed custom metrics.
+- **Amazon Managed Grafana (AMG)**: strictly a hosting option rather than a data source. It is Grafana-as-a-service: it uses IAM roles to reach AWS data sources such as CloudWatch, can discover AWS data sources in your account, and avoids self-hosting Grafana.
+
+For a Lambda-only stack with no existing Prometheus investment, the practical answer is: use EMF for custom metrics, use CloudWatch for the built-in Lambda metrics, and connect Grafana to CloudWatch. Beyond Grafana itself, this needs no extra infrastructure, and a first dashboard can be up in about an hour.