# Adjacent
> Brief orientation on AWS Glue and Prometheus/Grafana — the secondary gaps from the interview.
## AWS Glue
Glue is a managed Spark-based ETL service. Lambda and Glue solve different problems:

| | Lambda | Glue |
|--|--------|------|
| **Runtime model** | Serverless; up to 15 min; one invocation at a time per execution environment | Managed Spark cluster; hours-long jobs; distributed compute |
| **Data scale** | Up to a few GB comfortably | TB to PB natively |
| **Language** | Python, Node, Java, Go, custom runtime | PySpark, Scala; Glue Studio for no-code |
| **Startup time** | Milliseconds (warm) | 1–2 minutes to provision Spark cluster |
| **Cost model** | Per request + per ms | Per DPU-hour (about $0.44 per DPU-hour, region-dependent); billed per second with a 1-minute minimum on Glue 2.0 and later (10-minute minimum on Glue 0.9/1.0) |
| **Use for** | Light transforms, event reactions, API backends | Large-scale joins, aggregations, schema inference on data lake |

Key Glue concepts to know: **DynamicFrame** (Glue's DataFrame variant with schema flexibility), **Glue Data Catalog** (centralised metadata store for table schemas, also used by Athena), **Job Bookmarks** (Glue tracks which data it has already processed, such as S3 objects, to avoid reprocessing on incremental runs).

The decision is usually straightforward: if the data fits in Lambda's memory and the job finishes in under 15 minutes, use Lambda. If you're joining multiple large S3 datasets or transforming daily partition files, use Glue.
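The rule of thumb above can be sketched as a small helper. This is only an illustration: `Job`, `choose_engine`, and `glue_cost_usd` are hypothetical names, the 2 GB cutoff is an assumed stand-in for "a few GB", and the DPU price is the figure quoted in the table, which varies by region.

```python
from dataclasses import dataclass

LAMBDA_TIMEOUT_MIN = 15    # hard Lambda limit, from the table above
COMFORTABLE_DATA_GB = 2.0  # assumed cutoff for "a few GB"; tune for your workload
DPU_HOURLY_USD = 0.44      # price quoted in the table; check your region


@dataclass
class Job:
    data_gb: float                  # data the job must hold or scan
    est_runtime_min: float          # expected wall-clock runtime
    needs_large_join: bool = False  # joins across multiple large S3 datasets


def choose_engine(job: Job) -> str:
    """Return "lambda" or "glue" using the rule of thumb from the text."""
    fits_memory = job.data_gb <= COMFORTABLE_DATA_GB
    fits_timeout = job.est_runtime_min < LAMBDA_TIMEOUT_MIN
    if fits_memory and fits_timeout and not job.needs_large_join:
        return "lambda"
    return "glue"


def glue_cost_usd(dpus: int, runtime_min: float, min_billed_min: float) -> float:
    """Rough Glue job cost: DPUs x billed hours x hourly DPU price.

    min_billed_min is the minimum billed duration, which depends on the
    Glue version, so the caller has to supply it.
    """
    billed_min = max(runtime_min, min_billed_min)
    return round(dpus * (billed_min / 60) * DPU_HOURLY_USD, 4)
```

For example, `choose_engine(Job(data_gb=0.5, est_runtime_min=3))` picks Lambda, while a 500 GB join that runs for 90 minutes falls through to Glue. The cost helper is only for comparing orders of magnitude, not a billing calculator.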
## Prometheus
Prometheus is a pull-based time-series metrics system. It scrapes HTTP `/metrics` endpoints on a schedule. The fundamental tension with Lambda: functions are ephemeral, so there is no persistent HTTP endpoint to scrape, and the function may be at zero concurrency between invocations.

Options for Lambda → Prometheus:

- **EMF → CloudWatch → Grafana CloudWatch plugin**: no Prometheus involved. Grafana reads directly from CloudWatch. Easiest for AWS-native stacks.
- **Remote write to Amazon Managed Service for Prometheus (AMP)**: the function pushes metrics to AMP via the Prometheus remote_write API at the end of each invocation. Grafana or Amazon Managed Grafana reads from AMP. The request must be SigV4-signed and carry a Snappy-compressed protobuf payload. The `prometheus_client` library can define the metrics but does not produce remote_write requests by itself, so this option usually needs an extra encoder or an OpenTelemetry-based exporter.
- **Pushgateway**: a persistent intermediary that Lambda pushes to; Prometheus scrapes the gateway. More infrastructure to manage, and there is a stale-metric risk because the Pushgateway keeps serving the last pushed values until they are overwritten or deleted.
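To make the Pushgateway option more concrete, the sketch below renders gauges in the Prometheus text exposition format and builds the grouping-key URL that the Pushgateway expects. It uses only the standard library. The function names, gateway address, job name, and metric names are all hypothetical, and `push` is defined but never called here.

```python
import urllib.request
from typing import Dict, Optional
from urllib.parse import quote


def render_gauges(metrics: Dict[str, float]) -> bytes:
    """Render gauges in the Prometheus text exposition format."""
    lines = []
    for name, value in metrics.items():
        lines.append(f"# TYPE {name} gauge")
        lines.append(f"{name} {float(value)}")
    return ("\n".join(lines) + "\n").encode("utf-8")


def group_url(base_url: str, job: str, labels: Optional[Dict[str, str]] = None) -> str:
    """Build the grouping-key URL: /metrics/job/<job>/<label>/<value>...

    Assumes label values contain no "/" characters.
    """
    path = f"/metrics/job/{quote(job, safe='')}"
    for key, value in (labels or {}).items():
        path += f"/{quote(key, safe='')}/{quote(value, safe='')}"
    return base_url.rstrip("/") + path


def push(base_url: str, job: str, metrics: Dict[str, float]) -> None:
    """PUT replaces every metric in the group, so old series do not linger."""
    req = urllib.request.Request(
        group_url(base_url, job),
        data=render_gauges(metrics),
        method="PUT",
        headers={"Content-Type": "text/plain; version=0.0.4"},
    )
    with urllib.request.urlopen(req, timeout=2):
        pass
```

Using `PUT` rather than `POST` is a deliberate choice in this sketch: it replaces the whole group on each push, which limits (but does not remove) the stale-metric risk. A `DELETE` on the same URL removes the group entirely.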
## Grafana
Grafana is a dashboarding layer: it doesn't store data, it queries data sources. Relevant data sources for Lambda observability:

- **CloudWatch**: built-in Grafana plugin; queries CloudWatch Metrics and CloudWatch Logs Insights. Zero extra infrastructure. The standard choice for Lambda metrics (invocations, errors, duration, throttles, concurrent executions).
- **Amazon Managed Service for Prometheus (AMP)**: query via PromQL if you've pushed custom metrics.
- **Amazon Managed Grafana (AMG)**: strictly a hosting option rather than a data source. Grafana-as-a-service; integrates with AWS IAM; auto-discovers CloudWatch namespaces. Avoids self-hosting Grafana.

For a Lambda-only stack with no existing Prometheus investment, the practical answer is: use EMF for custom metrics, use CloudWatch for the built-in Lambda metrics, and connect Grafana to CloudWatch. This requires no extra infrastructure and can give you working dashboards within about an hour.
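As a rough illustration of the EMF route, the sketch below builds one Embedded Metric Format record and prints it from a handler. Lambda forwards stdout to CloudWatch Logs, where the metric is extracted. The namespace, function name, metric name, and the `emf_record` helper are assumptions made for the example, not part of any AWS SDK.

```python
import json
import time


def emf_record(namespace, function_name, metric_name, value, unit="Count", now_ms=None):
    """Build one Embedded Metric Format record as a JSON string."""
    record = {
        "_aws": {
            # EMF timestamps are milliseconds since the Unix epoch.
            "Timestamp": int(time.time() * 1000) if now_ms is None else now_ms,
            "CloudWatchMetrics": [
                {
                    "Namespace": namespace,
                    "Dimensions": [["FunctionName"]],
                    "Metrics": [{"Name": metric_name, "Unit": unit}],
                }
            ],
        },
        # Dimension and metric values live at the top level of the record.
        "FunctionName": function_name,
        metric_name: value,
    }
    return json.dumps(record)


def handler(event, context):
    # Hypothetical handler: emits one custom metric per invocation.
    processed = len(event.get("orders", []))
    print(emf_record("MyApp", "orders-handler", "OrdersProcessed", processed))
    return {"ok": True}
```

Because the metric travels as a log line, the function needs no extra network call or client library, which is the main reason this route suits a Lambda-only stack.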