simple is better

buenosairesam
2026-01-22 16:22:15 -03:00
parent dc3518f138
commit 82c4551e71
9 changed files with 178 additions and 493 deletions

155
CLAUDE.md

@@ -2,131 +2,90 @@
## Project Overview
A real-time system monitoring platform that streams metrics from multiple machines to a central hub with a live web dashboard. Built to demonstrate production microservices patterns (gRPC, FastAPI, streaming, event-driven architecture) while solving a real problem: monitoring development infrastructure across multiple machines.
A real-time system monitoring platform that streams metrics from multiple machines to a central hub with a live web dashboard. Built to demonstrate production microservices patterns (gRPC, FastAPI, streaming, event-driven architecture).
**Primary Goal:** Portfolio project demonstrating real-time streaming architecture
**Secondary Goal:** Actually useful tool for monitoring a multi-machine development environment
**Status:** Working MVP, deployed at sysmonstm.mcrn.ar
**Primary Goal:** Portfolio project demonstrating real-time streaming with gRPC
**Status:** Working, deployed at sysmonstm.mcrn.ar
## Deployment Modes
### Production (3-tier)
## Architecture
```
┌─────────────┐     ┌─────────────┐     ┌─────────────┐
│  Collector  │────▶│     Hub     │────▶│    Edge     │
│ (each host) │     │   (local)   │     │    (AWS)    │
└─────────────┘     └─────────────┘     └─────────────┘
┌─────────────┐     ┌─────────────────────────────────────┐     ┌─────────────┐
│  Collector  │────▶│  Aggregator + Gateway + Redis + TS  │────▶│    Edge     │────▶ Browser
│   (mcrn)    │gRPC │               (LOCAL)               │ WS  │    (AWS)    │ WS
└─────────────┘     └─────────────────────────────────────┘     └─────────────┘
┌─────────────┐                    │
│  Collector  │────────────────────┘
│   (nfrt)    │gRPC
└─────────────┘
```
- **Collector** (`ctrl/collector/`) - Lightweight agent on each monitored machine
- **Hub** (`ctrl/hub/`) - Local aggregator, receives from collectors, forwards to edge
- **Edge** (`ctrl/edge/`) - Cloud dashboard, public-facing
### Development (Full Stack)
```bash
docker compose up # Uses ctrl/dev/docker-compose.yml
```
- Full gRPC-based microservices architecture
- Services: aggregator, gateway, collector, alerts
- Storage: Redis (hot), TimescaleDB (historical)
- **Collectors** (`services/collector/`) - gRPC clients on each monitored machine
- **Aggregator** (`services/aggregator/`) - gRPC server, stores in Redis/TimescaleDB
- **Gateway** (`services/gateway/`) - FastAPI, bridges gRPC to WebSocket, forwards to edge
- **Edge** (`ctrl/edge/`) - Simple WebSocket relay for AWS, serves public dashboard
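
The relay role described for the gateway and edge (one incoming stream, many connected dashboards) can be sketched with plain `asyncio` queues standing in for WebSocket connections. This is an illustrative sketch only: the `Relay` class, its method names, the 100-message buffer, and the sample message fields are assumptions, not code from this repository.

```python
import asyncio


class Relay:
    """Fan-out hub: every published message is copied to each subscriber queue."""

    def __init__(self) -> None:
        self._subscribers: set[asyncio.Queue] = set()

    def subscribe(self) -> asyncio.Queue:
        # One bounded queue per connected client (a stand-in for a WebSocket).
        queue: asyncio.Queue = asyncio.Queue(maxsize=100)
        self._subscribers.add(queue)
        return queue

    def unsubscribe(self, queue: asyncio.Queue) -> None:
        self._subscribers.discard(queue)

    def publish(self, message: dict) -> None:
        for queue in self._subscribers:
            if queue.full():
                # Drop the oldest message so a slow client never blocks the others.
                queue.get_nowait()
            queue.put_nowait(message)


async def demo() -> list[dict]:
    relay = Relay()
    browser_a = relay.subscribe()
    browser_b = relay.subscribe()
    # Hypothetical metric payload; the real schema lives in proto/metrics.proto.
    relay.publish({"machine_id": "mcrn", "cpu_percent": 12.5})
    return [await browser_a.get(), await browser_b.get()]
```

Calling `asyncio.run(demo())` hands the same message to both subscribers. Dropping the oldest entry on overflow is one possible policy for live metrics, where the newest sample matters most.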
## Directory Structure
```
sms/
├── services/ # gRPC-based microservices (dev stack)
├── services/ # gRPC-based microservices
│ ├── collector/ # gRPC client, streams to aggregator
│ ├── aggregator/ # gRPC server, stores in Redis/TimescaleDB
│ ├── gateway/ # FastAPI, bridges gRPC to WebSocket
│ ├── gateway/ # FastAPI, WebSocket, forwards to edge
│ └── alerts/ # Event subscriber for threshold alerts
├── ctrl/ # Deployment configurations
│ ├── collector/ # Lightweight WebSocket collector
│ ├── hub/ # Local aggregator
│ ├── edge/ # Cloud dashboard
│ └── dev/ # Full stack docker-compose
│ ├── dev/ # Full stack docker-compose
│ └── edge/ # Cloud dashboard (AWS)
├── proto/ # Protocol Buffer definitions
├── shared/ # Shared Python modules
├── web/ # Dashboard templates and static files
├── infra/ # Terraform for AWS deployment
└── k8s/ # Kubernetes manifests
├── shared/ # Shared Python modules (config, logging, events)
└── web/ # Dashboard templates and static files
```
## Current Setup
## Running
**Machines being monitored:**
- `mcrn` - Primary workstation (runs hub + collector)
- `nfrt` - Secondary machine (runs collector only)
**Topology:**
### Local Development
```bash
docker compose up
```
mcrn                         nfrt                     AWS
├── hub ◄─────────────────── collector                edge (sysmonstm.mcrn.ar)
│    │                                                ▲
│    └────────────────────────────────────────────────┘
└── collector
### With Edge Forwarding (to AWS)
```bash
EDGE_URL=wss://sysmonstm.mcrn.ar/ws docker compose up
```
### Collector on Remote Machine
```bash
docker run -d --network host \
-e AGGREGATOR_URL=<local-ip>:50051 \
-e MACHINE_ID=$(hostname) \
registry.mcrn.ar/sysmonstm/collector:latest
```
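
The `docker run` command above passes the collector its settings through `AGGREGATOR_URL` and `MACHINE_ID`. A minimal sketch of how a collector might read them follows. The `CollectorConfig` dataclass, the `load_config` helper, and the `localhost:50051` fallback are assumptions made for illustration, not confirmed details of `services/collector/`.

```python
import os
import socket
from dataclasses import dataclass
from typing import Mapping, Optional


@dataclass(frozen=True)
class CollectorConfig:
    """Hypothetical settings holder; the real collector may structure this differently."""

    aggregator_url: str
    machine_id: str


def load_config(env: Optional[Mapping[str, str]] = None) -> CollectorConfig:
    """Read collector settings from environment variables, with fallbacks."""
    env = os.environ if env is None else env
    return CollectorConfig(
        # Assumed default: an aggregator on the same host, port 50051 as in the command above.
        aggregator_url=env.get("AGGREGATOR_URL", "localhost:50051"),
        # Mirrors `-e MACHINE_ID=$(hostname)` when the variable is unset or empty.
        machine_id=env.get("MACHINE_ID") or socket.gethostname(),
    )
```

Passing a plain dict instead of `os.environ` keeps the function easy to exercise, for example `load_config({"AGGREGATOR_URL": "10.0.0.5:50051", "MACHINE_ID": "nfrt"})`.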
## Technical Stack
### Core Technologies
- **Python 3.11+** - Primary language
- **FastAPI** - Web gateway, REST endpoints, WebSocket streaming
- **gRPC** - Inter-service communication (dev stack)
- **WebSockets** - Production deployment communication
- **psutil** - System metrics collection
- **Python 3.11+**
- **gRPC** - Collector to aggregator communication (showcased)
- **FastAPI** - Gateway REST/WebSocket
- **Redis** - Pub/Sub events, current state cache
- **TimescaleDB** - Historical metrics storage
- **WebSocket** - Gateway to edge, edge to browser
### Storage (Dev Stack Only)
- **PostgreSQL/TimescaleDB** - Time-series historical data
- **Redis** - Current state, caching, event pub/sub
## Key Files
### Infrastructure
- **Docker Compose** - Orchestration
- **Woodpecker CI** - Build pipeline at ppl/pipelines/sysmonstm/
- **Registry** - registry.mcrn.ar/sysmonstm/
| File | Purpose |
|------|---------|
| `proto/metrics.proto` | gRPC service and message definitions |
| `services/collector/main.py` | gRPC streaming client |
| `services/aggregator/main.py` | gRPC server, metric processing |
| `services/gateway/main.py` | WebSocket bridge, edge forwarding |
| `ctrl/edge/edge.py` | Simple WebSocket relay for AWS |
## Images
## Portfolio Talking Points
| Image | Purpose |
|-------|---------|
| `collector` | Lightweight WebSocket collector for production |
| `hub` | Local aggregator for production |
| `edge` | Cloud dashboard for production |
| `aggregator` | gRPC aggregator (dev stack) |
| `gateway` | FastAPI gateway (dev stack) |
| `collector-grpc` | gRPC collector (dev stack) |
| `alerts` | Alert service (dev stack) |
## Development Guidelines
### Code Quality
- Type hints throughout (Python 3.11+ syntax)
- Async/await patterns consistently
- Logging (not print statements)
- Error handling at boundaries
### Docker
- Multi-stage builds for smaller images
- `--network host` for collectors (accurate network metrics)
### Configuration
- Environment variables for all config
- Sensible defaults
- No secrets in code
## Interview/Portfolio Talking Points
### Architecture Decisions
- "3-tier for production: collector → hub → edge"
- "Hub allows local aggregation and buffering before forwarding to cloud"
- "Edge terminology shows awareness of edge computing patterns"
- "Full gRPC stack for development demonstrates microservices patterns"
### Trade-offs
- Production vs Dev: simplicity/cost vs full architecture demo
- WebSocket vs gRPC: browser compatibility vs efficiency
- In-memory vs persistent: operational simplicity vs durability
- **gRPC streaming** - Efficient binary protocol for real-time metrics
- **Event-driven** - Redis Pub/Sub decouples processing from delivery
- **Edge pattern** - Heavy processing local, lightweight relay in cloud
- **Cost optimization** - ~$10/mo for public dashboard (data transfer, not requests)
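
The event-driven point can be illustrated without Redis: a publisher hands events to a channel and never calls its consumers directly. The sketch below uses an in-process `EventBus` as a stand-in for Redis Pub/Sub. The class, the `metrics` channel name, the 90% threshold, and the event fields are all hypothetical and chosen only for the example.

```python
from collections import defaultdict
from typing import Callable

Handler = Callable[[dict], None]


class EventBus:
    """In-process stand-in for Redis Pub/Sub: publishers never call consumers directly."""

    def __init__(self) -> None:
        self._handlers: dict[str, list[Handler]] = defaultdict(list)

    def subscribe(self, channel: str, handler: Handler) -> None:
        self._handlers[channel].append(handler)

    def publish(self, channel: str, event: dict) -> int:
        """Deliver the event to every handler on the channel; return the receiver count."""
        handlers = self._handlers.get(channel, [])
        for handler in handlers:
            handler(event)
        return len(handlers)


CPU_ALERT_THRESHOLD = 90.0  # hypothetical threshold, in percent
alerts: list[str] = []
dashboard_feed: list[dict] = []


def check_cpu(event: dict) -> None:
    """Alerts-style consumer: record a message when CPU crosses the threshold."""
    if event["cpu_percent"] > CPU_ALERT_THRESHOLD:
        alerts.append(f"{event['machine_id']}: CPU at {event['cpu_percent']}%")


bus = EventBus()
bus.subscribe("metrics", check_cpu)              # processing side
bus.subscribe("metrics", dashboard_feed.append)  # delivery side

bus.publish("metrics", {"machine_id": "mcrn", "cpu_percent": 97.0})
bus.publish("metrics", {"machine_id": "nfrt", "cpu_percent": 35.0})
```

After the two publishes, `alerts` holds one entry for `mcrn` and `dashboard_feed` holds both events. Adding or removing a consumer requires no change to the publisher, which is the decoupling the bullet refers to.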