System Monitoring Platform
Documentation
System Overview
High-level architecture showing all services, data stores, and communication patterns.
Key Components
- Collector: Runs on each monitored machine, streams metrics via gRPC
- Aggregator: Central gRPC server, receives streams, normalizes data
- Gateway: FastAPI service, WebSocket for browser, REST for queries
- Alerts: Subscribes to events, evaluates thresholds, triggers actions
Data Flow Pipeline
How metrics flow from collection through storage with different retention tiers.
Storage Tiers
| Tier | Resolution | Retention | Use Case |
|---|---|---|---|
| Hot (Redis) | 5s | 5 min | Current state, live dashboard |
| Raw (TimescaleDB) | 5s | 24h | Recent detailed analysis |
| 1-min Aggregates | 1m | 7d | Week view, trends |
| 1-hour Aggregates | 1h | 90d | Long-term analysis |
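The step from the raw tier to the aggregate tiers can be illustrated with a small sketch. This is not the project's pipeline: the function name `downsample` and the sample data are hypothetical, and averaging is assumed as the aggregation function. In a real deployment the database may do this work itself (TimescaleDB offers continuous aggregates for this purpose); the source does not say which approach is used.

```python
from collections import defaultdict
from statistics import mean


def downsample(samples, bucket_seconds):
    """Average (timestamp, value) samples into fixed-width time buckets.

    Returns a list of (bucket_start, mean_value) pairs sorted by time.
    """
    buckets = defaultdict(list)
    for ts, value in samples:
        # Align each timestamp to the start of its bucket.
        bucket_start = ts - (ts % bucket_seconds)
        buckets[bucket_start].append(value)
    return [(start, mean(values)) for start, values in sorted(buckets.items())]


# Hypothetical CPU readings taken every 5 seconds for two minutes.
raw = [(t, 50.0 if t < 60 else 70.0) for t in range(0, 120, 5)]

# 24 raw points at 5s resolution collapse into 2 points at 1m resolution.
one_minute = downsample(raw, 60)
```

The same function with `bucket_seconds=3600` would produce the 1-hour tier from the same input.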
Deployment Architecture
Deployment options from local development to AWS production.
Environments
- Local Dev: Kind + Tilt for K8s, or Docker Compose
- Demo (EC2): Docker Compose on t2.small at sysmonstm.mcrn.ar
- Lambda Pipeline: SQS-triggered aggregation for data processing experience
gRPC Service Definitions
Protocol Buffer service and message definitions.
Services
- MetricsService: Client-side streaming for metrics ingestion
- ControlService: Bidirectional streaming for collector control
- ConfigService: Server-side streaming for config updates
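To make the client-streaming shape of MetricsService concrete, here is a minimal sketch that imitates it with plain asyncio instead of a real gRPC channel: the collector produces a stream of messages and the aggregator replies once after the stream ends. The message types `Metric` and `StreamAck`, their fields, and the function names are assumptions for illustration; the actual protobuf definitions are not shown in this document.

```python
import asyncio
from dataclasses import dataclass
from typing import AsyncIterator


@dataclass
class Metric:  # hypothetical message type
    machine_id: str
    name: str
    value: float


@dataclass
class StreamAck:  # hypothetical single response
    received: int


async def collect(machine_id: str, readings: list[float]) -> AsyncIterator[Metric]:
    """Collector side: yield one message per reading, like a request stream."""
    for value in readings:
        yield Metric(machine_id=machine_id, name="cpu_percent", value=value)
        await asyncio.sleep(0)  # hand control back to the event loop


async def stream_metrics(request_stream: AsyncIterator[Metric]) -> StreamAck:
    """Aggregator side: consume the whole stream, then reply exactly once."""
    received = 0
    async for _metric in request_stream:
        received += 1
    return StreamAck(received=received)


ack = asyncio.run(stream_metrics(collect("host-1", [12.5, 40.0, 97.3])))
```

Server streaming (ConfigService) is the mirror image: one request in, an async iterator of responses out. Bidirectional streaming (ControlService) has iterators on both sides.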
Interview Talking Points
Domain Mapping
- Machine = Payment Processor
- Metrics Stream = Transaction Stream
- Thresholds = Fraud Detection
- Aggregator = Payment Hub
gRPC Patterns
- Client streaming (metrics)
- Server streaming (config)
- Bidirectional (control)
- Health checking
Event-Driven
- Redis Pub/Sub (current)
- Abstraction for Kafka switch
- Decoupled alert processing
- Real-time WebSocket push
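One way to keep a later switch from Redis Pub/Sub to Kafka cheap is to have publishers and subscribers depend on a small interface rather than on a client library. The sketch below shows that idea with an in-memory backend. `EventBus`, `InMemoryBus`, the channel name, and the 90% threshold are all hypothetical; the project's real abstraction may look different.

```python
import asyncio
from collections import defaultdict
from typing import Awaitable, Callable, Protocol

Handler = Callable[[dict], Awaitable[None]]


class EventBus(Protocol):
    """Interface that publishers and subscribers depend on."""

    def subscribe(self, channel: str, handler: Handler) -> None: ...

    async def publish(self, channel: str, event: dict) -> None: ...


class InMemoryBus:
    """Stand-in backend; a Redis or Kafka class would expose the same methods."""

    def __init__(self) -> None:
        self._handlers: dict[str, list[Handler]] = defaultdict(list)

    def subscribe(self, channel: str, handler: Handler) -> None:
        self._handlers[channel].append(handler)

    async def publish(self, channel: str, event: dict) -> None:
        # Deliver the event to every handler registered on this channel.
        for handler in self._handlers[channel]:
            await handler(event)


async def main(bus: EventBus) -> list[dict]:
    alerts: list[dict] = []

    async def on_metric(event: dict) -> None:
        # Alert processing is decoupled: it only sees events, never the publisher.
        if event["cpu_percent"] > 90:
            alerts.append(event)

    bus.subscribe("metrics", on_metric)
    await bus.publish("metrics", {"machine": "host-1", "cpu_percent": 35.0})
    await bus.publish("metrics", {"machine": "host-2", "cpu_percent": 96.0})
    return alerts


alerts = asyncio.run(main(InMemoryBus()))
```

Because `main` only accepts an `EventBus`, swapping the backend would mean passing a different object, with no change to the alert logic.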
Resilience
- Collectors are independent
- Graceful degradation
- Retry with backoff
- Health checks everywhere
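"Retry with backoff" could look roughly like the following for a collector reconnecting to the aggregator. This is a sketch under assumptions: the helper name `retry_with_backoff`, its parameters, the use of `ConnectionError` as the retryable error, and the randomized ("full jitter") delay are illustrative choices, not the project's actual policy.

```python
import asyncio
import random


async def retry_with_backoff(operation, *, attempts=5, base_delay=0.5, max_delay=30.0):
    """Await operation(); on ConnectionError, wait and try again.

    The delay ceiling doubles after each failure, up to max_delay.
    The last failure is re-raised to the caller.
    """
    for attempt in range(attempts):
        try:
            return await operation()
        except ConnectionError:
            if attempt == attempts - 1:
                raise
            ceiling = min(max_delay, base_delay * 2 ** attempt)
            # Sleep a random amount up to the ceiling so that many
            # collectors do not all reconnect at the same instant.
            await asyncio.sleep(random.uniform(0, ceiling))


# Hypothetical connection that fails twice before succeeding.
attempts_seen = []


async def flaky_connect():
    attempts_seen.append(1)
    if len(attempts_seen) < 3:
        raise ConnectionError("aggregator unreachable")
    return "connected"


result = asyncio.run(retry_with_backoff(flaky_connect, base_delay=0.01))
```

Since each collector retries on its own, one unreachable machine does not block the others, which fits the "collectors are independent" point above.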
Technology Stack
Core
- Python 3.11+
- FastAPI
- gRPC / protobuf
- asyncio
Data
- TimescaleDB
- Redis
- Redis Pub/Sub
Infrastructure
- Docker
- Kubernetes
- Kind + Tilt
- Terraform
CI/CD
- Woodpecker CI
- Kustomize
- Container Registry