— state per worker
- │ idle → processing → idle → ... → stopped
- ├─ stats: PipelineStats
- │ total_chunks, processed, failed, retries, elapsed, throughput_mbps, queue_size
- ├─ errors: ErrorEntry[] — every event containing an error field
- └─ queueSize: number — last seen queue_size value
-
- Renders:
- ├─ ChunkGrid — colored cells per chunk (pending/queued/processing/done/error)
- ├─ QueueGauge — current queue depth / max
- ├─ WorkerPanel — per-worker state + current chunk assignment
- ├─ StatsPanel — elapsed time, throughput, processed/failed counts
- ├─ ErrorLog — scrollable error list
- └─ OutputFiles — download links (when done)
-```
-
-**Files:** `ui/chunker/src/hooks/useGrpcStream.ts`, `ui/chunker/src/App.tsx`
-
----
-
-## Step 8: Output File Access (after pipeline completes)
-
-```
-App.tsx useEffect([done, jobId]):
- → api.ts: getChunkOutputFiles(jobId)
- → POST /graphql → graphql.py: chunk_output_files(job_id)
- → Reads /app/media/out/chunks/{job_id}/ directory listing from disk
- → Returns [{key, size, url: "/media/out/chunks/{job_id}/chunk_0001.mp4"}]
- → Browser renders download links
- → Click link → nginx /media/out/ → alias /app/media/out/ → serves file from disk
-```
-
-Chunks are written directly to `media/out/chunks/{job_id}/` by the ffmpeg processor — no MinIO upload needed for output. Nginx serves them with `autoindex on`.
-
-**Files:** `core/api/graphql.py`, `core/jobs/handlers/chunk.py`, `ctrl/nginx.conf`
-
----
-
-## Event Types Reference
-
-| Event | Source | Key Fields |
-|-------|--------|------------|
-| `pipeline_start` | Pipeline.run() | source, chunk_duration, num_workers, processor_type |
-| `pipeline_info` | Pipeline.run() | file_size, source_duration, total_chunks |
-| `pipeline_progress` | Monitor thread (500ms) | elapsed, throughput_mbps |
-| `chunk_queued` | Producer thread | sequence, start_time, end_time, duration, queue_size |
-| `chunk_processing` | Worker thread | sequence, worker_id, state, queue_size |
-| `chunk_done` | Worker thread | sequence, processing_time, retries, queue_size |
-| `chunk_retry` | Worker thread | sequence, attempt, backoff |
-| `chunk_error` | Worker thread | sequence, error, retries |
-| `chunk_collected` | ResultCollector | sequence, buffered, emitted |
-| `worker_status` | Worker thread | worker_id, state (idle/processing/stopped) |
-| `pipeline_complete` | Pipeline.run() | total_chunks, processed, failed, elapsed, throughput_mbps |
-| `pipeline_error` | Pipeline.run() | error |
-
----
-
-## Thread Model (inside Celery worker)
-
-```
-Celery worker process
- └─ run_job task thread
- └─ Pipeline.run()
- ├─ Producer thread — enqueues chunks
- ├─ Monitor thread — emits progress every 500ms
- ├─ Worker thread 0 — pulls from queue, processes
- ├─ Worker thread 1 — pulls from queue, processes
- ├─ Worker thread 2 — pulls from queue, processes
- └─ Worker thread 3 — pulls from queue, processes
-```
-
-All threads share the same `event_callback` → `event_bridge` → `push_event()`, which creates a new Redis connection per call. Thread-safe via Redis atomic RPUSH.
-
----
-
-## Infrastructure
-
-| Service | Port | Role |
-|---------|------|------|
-| nginx | 80 | Reverse proxy, static file serving |
-| fastapi | 8702 | GraphQL API (Strawberry) |
-| celery | — | Task worker (runs pipeline) |
-| redis | 6379 | Event bus + Celery broker |
-| grpc | 50051 | gRPC server (StreamChunkPipeline) |
-| envoy | 8090 | gRPC-Web ↔ native gRPC translation |
-| minio | 9000 | S3-compatible source media storage |
-| postgres | 5432 | Job/asset metadata |
diff --git a/docs/architecture/05-detection-pipeline.md b/docs/architecture/05-detection-pipeline.md
new file mode 100644
index 0000000..e67686f
--- /dev/null
+++ b/docs/architecture/05-detection-pipeline.md
@@ -0,0 +1,145 @@
+# Detection Pipeline — Execution Path
+
+## Overview
+
+A pipeline run is a sequence of named **stages** that read and write a shared
+`DetectState` dict. Stages are defined in `core/detect/stages/`; the orchestrator
+(`core/detect/graph/runner.py`) flattens the profile's `PipelineConfig` graph into a
+linear order, runs each stage, and emits SSE events to the browser.
+
+The full stage list is in `core/detect/graph/nodes.py`:
+
+```
+extract_frames → filter_scenes
+ → field_segmentation → detect_edges
+ → detect_objects → preprocess → run_ocr
+ → match_brands → escalate_vlm → escalate_cloud
+ → compile_report
+```
+
+See `03-detection-pipeline.svg` for the graph view.
+
+## Profile
+
+A `Profile` row in Postgres holds two JSONB blobs:
+
+- `pipeline` — a `PipelineConfig` (stages + edges + routing rules) defining topology
+- `configs` — `{stage_name: {...}}` per-stage parameters (fps, thresholds, prompts, ...)
+
+Profiles are the config mechanism: **duplicate a profile and tweak it** instead of
+patching defaults. `core/detect/profile.py` loads profiles by name; `_load_profile()`
+in `nodes.py` merges the job's `config_overrides` on top.
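+
+For orientation, a hypothetical profile might look roughly like this (the stage names
+are real, but the individual config keys and values are illustrative, not the actual
+schema):
+
+```
+profile = {
+    "pipeline": {  # PipelineConfig: topology only (stages + edges + routing)
+        "stages": ["extract_frames", "filter_scenes", "detect_objects",
+                   "run_ocr", "match_brands", "compile_report"],
+        "edges": [["extract_frames", "filter_scenes"],
+                  ["filter_scenes", "detect_objects"]],
+    },
+    "configs": {  # per-stage parameters, parsed by each stage into its typed config
+        "extract_frames": {"fps": 2},
+        "run_ocr": {"min_confidence": 0.6},   # key name illustrative
+        "match_brands": {"threshold": 0.8},   # key name illustrative
+    },
+}
+```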
+
+## Stage runner
+
+`PipelineRunner` (in `core/detect/graph/runner.py`) iterates the flattened stages and
+between each one checks per-job control flags (all keyed by `job_id`; the loop is
+sketched below):
+
+- **cancel** — `set_cancel_check(job_id, fn)`; raises `PipelineCancelled` to abort
+- **pause / resume** — a `threading.Event` per job; `_wait_if_paused()` blocks
+- **step** — like resume but auto-pauses after the next stage completes
+- **pause-after-stage** — toggle to step through every stage
+
+Each stage runs inside `trace_node(state, name)` (sets a span used by tracing) and
+emits `running` → `done` (or `skipped`) transitions via `core/detect/emit.py`.
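+
+A minimal sketch of that loop, assuming hypothetical wiring (only `trace_node`,
+`PipelineCancelled`, the per-job `threading.Event`, and the emit transitions are named
+in this doc; the real runner lives in `core/detect/graph/runner.py`):
+
+```
+import threading
+from contextlib import contextmanager
+
+class PipelineCancelled(Exception):
+    pass
+
+@contextmanager
+def trace_node(state, name):          # stand-in for the real tracing span
+    yield
+
+def run_stages(stages, state, configs, cancel_check, pause_event, emit_transition):
+    """Illustrative control loop: cancel check, pause/resume wait, then run the stage."""
+    for name, stage_fn in stages:              # linearized PipelineConfig graph
+        if cancel_check():                     # registered via set_cancel_check(job_id, fn)
+            raise PipelineCancelled(name)
+        pause_event.wait()                     # _wait_if_paused(): blocks while paused
+        with trace_node(state, name):
+            emit_transition(name, "running")
+            state.update(stage_fn(state, configs.get(name, {})))
+            emit_transition(name, "done")
+    return state
+```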
+
+## Inference: GPU-host indirection
+
+`core/detect/graph/nodes.py` reads `INFERENCE_URL` from the environment and passes it
+to every CV/ML stage:
+
+- `INFERENCE_URL=""` (default in dev) — stages call CV/ML routines in-process
+- `INFERENCE_URL=http://gpu-host:8000` — stages POST to the GPU server
+ (`core/gpu/server.py`) which exposes `/detect`, `/ocr`, `/preprocess`, `/vlm`,
+ `/detect_edges`, `/segment_field` (each with a `/debug` variant that returns
+ intermediate masks for the overlay viewer)
+
+Deployment note: the dev and GPU machines are separate boxes on the same LAN; inference is a
+network call. Heavy ML deps (`torch`, `transformers`, `paddleocr`) live only in
+`core/gpu/pyproject.toml` — the API host doesn't need them.
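+
+A sketch of the indirection as a stage might use it (the endpoint path is from the list
+above; the in-process fallback and the request/response shapes are assumptions):
+
+```
+import os
+import requests
+
+INFERENCE_URL = os.environ.get("INFERENCE_URL", "")
+
+def detect_objects(frame_bytes: bytes) -> dict:
+    """Run detection in-process or on the GPU host, depending on INFERENCE_URL."""
+    if not INFERENCE_URL:
+        # dev default: call the CV/ML routine in-process (stubbed here)
+        return {"boxes": [], "backend": "in-process"}
+    resp = requests.post(f"{INFERENCE_URL}/detect",
+                         files={"frame": frame_bytes},   # payload shape assumed
+                         timeout=60)
+    resp.raise_for_status()
+    return resp.json()
+```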
+
+## Browser-side CV (OpenCV WASM)
+
+Some stages (notably the field/edge stages) can run in the browser via OpenCV WASM
+(`ui/detection-app/src/cv/wasmBridge.ts`) for fast iteration without a round trip to
+the GPU host. The browser UI is the test surface for the "replay loop" — change a
+config, replay one stage, see the overlay. Browser CV uses OpenCV WASM directly; there
+are no TypeScript ports of the algorithms.
+
+## Cloud VLM escalation
+
+`escalate_vlm` (local VLM on GPU host) and `escalate_cloud` (Anthropic / Gemini /
+OpenAI / Groq via `core/detect/providers/`) are the last-resort branches for
+unresolved candidates from `match_brands`. Skip flags:
+
+- `SKIP_VLM=1` — emits `skipped` for `escalate_vlm`
+- `SKIP_CLOUD=1` — emits `skipped` for `escalate_cloud`
+
+## Checkpoints, StageOutput, and replay
+
+Two tables back the replay loop:
+
+- **Checkpoint** (`core/db/models.py:Checkpoint`) — a tree node:
+ `(parent_id, stage_name, config_overrides, stats)`. No blobs. Lets the UI show a
+ branching history of "what configs did we try at this stage?"
+- **StageOutput** — a flat upsert table keyed by `(job_id, stage_name)` holding the
+ stage's output dict. `replay-stage` reads upstream outputs from here so a single
+ stage can be re-run without rerunning the whole pipeline.
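+
+Conceptually, single-stage replay is "assemble state from the persisted upstream
+outputs, run one stage, upsert its result". A toy sketch with a plain dict standing in
+for the StageOutput table (names illustrative):
+
+```
+def replay_one_stage(stage_name, stage_fn, config, stage_outputs):
+    """stage_outputs: {stage_name: output_dict}, i.e. the rows keyed by (job_id, stage)."""
+    state = {}
+    for name, output in stage_outputs.items():   # upstream outputs read from the table
+        if name != stage_name:
+            state.update(output)
+    result = stage_fn(state, config)             # rerun only the chosen stage
+    stage_outputs[stage_name] = result           # flat upsert of the new output
+    return result
+```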
+
+API surface (`core/api/detect/replay.py`):
+
+- `GET /checkpoints/{timeline_id}` — full tree
+- `POST /replay` — clone a checkpoint into a new job, run from a chosen stage
+- `POST /replay-stage` — re-run one stage in place using upstream `StageOutput` rows
+- `GET /overlays/{timeline_id}/{job_id}/{stage}/{seq}` — debug overlays from MinIO
+
+## Event flow (SSE)
+
+Stages call `emit.transition(...)` / `emit.log(...)` / `emit.boxes(...)` etc.
+(`core/detect/emit.py`). These push into Redis (`core/detect/events.py`). The SSE
+endpoint `GET /detect/stream/{job_id}` (`core/api/detect/sse.py`) drains the Redis
+list and writes to the open SSE response. Envoy keeps the connection open for up to
+3600s (see `ctrl/k8s/base/envoy.yaml`).
+
+```
+stage code
+ → emit.* (core/detect/emit.py)
+ → push_detect_event → Redis RPUSH
+ → [poll] /detect/stream/{job_id} → SSE chunk
+ → fetch ReadableStream in detection-app
+ → Pinia store update → Vue panel re-render
+```
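+
+A compressed sketch of both ends of that flow (the Redis key name, poll interval, and
+terminal-event handling are assumptions; the real code is in `events.py` and `sse.py`):
+
+```
+import asyncio, json
+import redis
+from fastapi import FastAPI
+from fastapi.responses import StreamingResponse
+
+app = FastAPI()
+r = redis.Redis()
+
+def push_detect_event(job_id: str, event: dict) -> None:
+    # producer side (stage threads): atomic RPUSH onto a per-job list
+    r.rpush(f"detect:events:{job_id}", json.dumps(event))
+
+@app.get("/detect/stream/{job_id}")
+async def stream(job_id: str):
+    async def drain():
+        while True:                               # real endpoint stops on a terminal event
+            raw = r.lpop(f"detect:events:{job_id}")
+            if raw is None:
+                await asyncio.sleep(0.25)
+                continue
+            yield f"data: {raw.decode()}\n\n"     # one SSE frame per event
+    return StreamingResponse(drain(), media_type="text/event-stream")
+```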
+
+## Pipeline control endpoints
+
+All under `core/api/detect/run.py`:
+
+- `POST /run` — start a job from a timeline + profile
+- `POST /stop/{job_id}` — cancel
+- `POST /pause/{job_id}` / `POST /resume/{job_id}`
+- `POST /step/{job_id}` — run one stage and pause
+- `POST /pause-after-stage/{job_id}` — toggle pause-after-each-stage
+- `GET /status/{job_id}` — current stage, progress
+- `POST /clear/{job_id}` — discard runtime state
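+
+A minimal client-side walkthrough, assuming the dev Envoy route from the index page and
+a guessed `/run` payload and response shape:
+
+```
+import json
+import requests
+
+BASE = "http://k8s.mpr.local.ar:8080/api/detect"
+
+# Start a run (field names assumed), then follow its SSE stream.
+job = requests.post(f"{BASE}/run",
+                    json={"timeline_id": "<timeline-uuid>",
+                          "profile": "soccer_broadcast"}).json()
+job_id = job["job_id"]                            # response shape assumed
+
+with requests.get(f"{BASE}/stream/{job_id}", stream=True) as resp:
+    for line in resp.iter_lines():
+        if line.startswith(b"data: "):
+            print(json.loads(line[len(b"data: "):]))
+```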
+
+## Where the chunker UI fits
+
+`ui/chunker/` is a **standalone testing utility** for the source-chunking step (split
+a long source video into chunks the user picks for a Timeline). It is **not** a
+pipeline stage and is not part of the detection flow. The detection pipeline reads
+already-chunked sources from MinIO via `core/api/detect/sources.py`.
+
+## Files
+
+| Concern | File |
+|---|---|
+| Stage list | `core/detect/graph/nodes.py` |
+| Runner (cancel/pause/resume) | `core/detect/graph/runner.py` |
+| Profile loading | `core/detect/profile.py` |
+| Event emission | `core/detect/emit.py`, `core/detect/events.py` |
+| SSE endpoint | `core/api/detect/sse.py` |
+| Replay API | `core/api/detect/replay.py` |
+| Checkpoint storage | `core/detect/checkpoint/storage.py` |
+| GPU server | `core/gpu/server.py` |
+| Browser CV bridge | `ui/detection-app/src/cv/wasmBridge.ts` |
+| Cloud VLM providers | `core/detect/providers/` |
diff --git a/docs/architecture/styles.css b/docs/architecture/styles.css
deleted file mode 100644
index b3094f2..0000000
--- a/docs/architecture/styles.css
+++ /dev/null
@@ -1,209 +0,0 @@
-:root {
- --bg-color: #1a1a2e;
- --text-color: #e8e8e8;
- --accent-color: #4a90d9;
- --border-color: #333;
- --sidebar-width: 220px;
- --sidebar-bg: #151528;
-}
-
-* {
- box-sizing: border-box;
- margin: 0;
- padding: 0;
-}
-
-body {
- font-family: -apple-system, BlinkMacSystemFont, 'Segoe UI', Roboto, Oxygen, Ubuntu, sans-serif;
- background-color: var(--bg-color);
- color: var(--text-color);
- line-height: 1.6;
-}
-
-/* Sidebar navigation */
-.sidebar {
- position: fixed;
- top: 0;
- left: 0;
- width: var(--sidebar-width);
- height: 100vh;
- background: var(--sidebar-bg);
- border-right: 1px solid var(--border-color);
- padding: 1.5rem 1rem;
- overflow-y: auto;
- z-index: 10;
-}
-
-.sidebar h2 {
- font-size: 1.2rem;
- color: var(--accent-color);
- margin-bottom: 1.5rem;
- padding-bottom: 0.5rem;
- border-bottom: 1px solid var(--border-color);
-}
-
-.sidebar ul {
- list-style: none;
- display: flex;
- flex-direction: column;
- gap: 0.25rem;
-}
-
-.sidebar li {
- display: block;
-}
-
-.sidebar a {
- display: block;
- padding: 0.4rem 0.6rem;
- color: var(--text-color);
- text-decoration: none;
- font-size: 0.85rem;
- border-radius: 4px;
- transition: background 0.15s, color 0.15s;
-}
-
-.sidebar a:hover {
- background: rgba(74, 144, 217, 0.15);
- color: var(--accent-color);
-}
-
-/* Main content */
-.content {
- margin-left: var(--sidebar-width);
- padding: 2rem;
-}
-
-h1 {
- font-size: 2rem;
- margin-bottom: 1rem;
- color: var(--accent-color);
-}
-
-.content > h2 {
- font-size: 1.5rem;
- margin: 2rem 0 1rem;
- color: var(--text-color);
- border-bottom: 1px solid var(--border-color);
- padding-bottom: 0.5rem;
- scroll-margin-top: 1rem;
-}
-
-.diagram-container {
- display: flex;
- flex-wrap: wrap;
- gap: 2rem;
- margin-top: 1rem;
-}
-
-.diagram {
- flex: 1;
- min-width: 400px;
- background: #252540;
- border-radius: 8px;
- padding: 1rem;
- border: 1px solid var(--border-color);
-}
-
-.diagram h3 {
- font-size: 1.1rem;
- margin-bottom: 0.5rem;
- color: var(--accent-color);
-}
-
-.diagram img,
-.diagram object {
- width: 100%;
- height: auto;
- background: white;
- border-radius: 4px;
-}
-
-.diagram a {
- display: block;
- text-align: center;
- margin-top: 0.5rem;
- color: var(--accent-color);
- text-decoration: none;
- font-size: 0.9rem;
-}
-
-.diagram a:hover {
- text-decoration: underline;
-}
-
-.legend {
- margin-top: 2rem;
- padding: 1rem;
- background: #252540;
- border-radius: 8px;
- border: 1px solid var(--border-color);
-}
-
-.legend h3 {
- margin-bottom: 0.5rem;
-}
-
-.legend ul {
- list-style: none;
- display: flex;
- flex-wrap: wrap;
- gap: 1rem;
-}
-
-.legend li {
- display: flex;
- align-items: center;
- gap: 0.5rem;
-}
-
-.legend .color-box {
- width: 16px;
- height: 16px;
- border-radius: 3px;
-}
-
-code {
- background: #333;
- padding: 0.2rem 0.4rem;
- border-radius: 3px;
- font-family: 'Monaco', 'Consolas', monospace;
- font-size: 0.9em;
-}
-
-pre {
- background: #252540;
- padding: 1rem;
- border-radius: 8px;
- overflow-x: auto;
- border: 1px solid var(--border-color);
-}
-
-pre code {
- background: none;
- padding: 0;
-}
-
-/* Responsive: collapse sidebar on small screens */
-@media (max-width: 768px) {
- .sidebar {
- position: static;
- width: 100%;
- height: auto;
- border-right: none;
- border-bottom: 1px solid var(--border-color);
- }
-
- .sidebar ul {
- flex-direction: row;
- flex-wrap: wrap;
- }
-
- .content {
- margin-left: 0;
- }
-
- .diagram {
- min-width: 100%;
- }
-}
diff --git a/docs/index.html b/docs/index.html
index 413af1a..99e6598 100644
--- a/docs/index.html
+++ b/docs/index.html
@@ -1,380 +1,564 @@
-
+
-
-
-
- MPR - Architecture
-
-
-
-
+
+
+
+MPR — Detection Pipeline Architecture
+
+
+
+
+
+ MPR
+ Media Processing & Detection Pipeline — Architecture
+
+ OVERVIEW
+ A guided tour of the platform — start here for narrative context before the diagrams.
+
+
+
What MPR is
+
MPR is a brand / logo / text detection pipeline for video. A user picks chunks of source material into a Timeline, then runs a Profile (pipeline topology + per-stage config) against it. The pipeline extracts frames, filters scenes, runs CV (field segmentation, edge detection) and detection (YOLO, OCR), resolves text to a session brand list, and escalates anything still unresolved to a local VLM and then to cloud VLM providers. Output is a brand timeline and per-brand stats.
+
+
Where things run
+
The architecture spans four boxes: the browser (Vue 3 detection-app + OpenCV WASM worker for fast CV iteration), the K8s cluster (Envoy Gateway, FastAPI, detection-ui, Postgres, Redis, MinIO — Kind in dev via Tilt), a separate GPU host on the LAN running the inference server (YOLO, OCR, local VLM), and cloud VLM providers (Anthropic, Gemini, OpenAI, Groq) for last-resort escalation. See System.
+
+
Replay loop
+
The system is built around iteration. Checkpoint rows form a tree of "what configs did we try at this stage" (no blobs); StageOutput is a flat upsert table holding each stage's output dict. A single stage can be re-run in place using upstream StageOutput rows, so the UI loop is "tweak config → replay one stage → look at the overlay" without rerunning the whole pipeline. Frame caches keyed by timeline_id are reused across replays.
+
+
Profiles, not overrides
+
Profiles live in Postgres as two JSONB blobs — pipeline (stages + edges + routing) and configs (per-stage parameters). The convention is to duplicate a profile and tweak it, not to layer overrides at the call site. Job-level config_overrides exist but are merged on top of the resolved profile in core/detect/graph/nodes.py.
+
+
Inference indirection
+
Every CV/ML stage takes an INFERENCE_URL argument. Empty (the dev default) runs CV in-process; set, the stage POSTs to core/gpu/server.py on the GPU host. Heavy ML deps (torch, transformers, paddleocr) live only in core/gpu/pyproject.toml — the API host doesn't need them.
+
+
API and SSE
+
FastAPI under /detect/* (core/api/detect/): sources, run/stop/pause/resume/step, status, replay, checkpoints, overlays, config. Pipeline events fan out through Redis to GET /detect/stream/{job_id} as SSE. Envoy keeps the SSE connection open for up to 3600s.
+
+
Codegen
+
Source-of-truth dataclasses live in core/schema/models/. The standalone modelgen tool emits SQLModel ORM (core/db/models.py), Pydantic schemas, TypeScript types, and Protobuf definitions. Regenerate everything with bash ctrl/generate.sh.
+
+
+
+
+
+ SYSTEM ARCHITECTURE
+ Browser ↔ Envoy Gateway ↔ FastAPI / detection-ui ↔ data plane (Postgres / Redis / MinIO) ↔ LAN GPU host ↔ cloud VLM providers.
+
+

+
+
+ Browser
+ K8s cluster
+ GPU host (LAN)
+ Cloud VLM
+
+
+
+
+ DETECTION PIPELINE
+ 11 named stages from core/detect/graph/nodes.py. The runner flattens the profile's PipelineConfig graph into a linear sequence and runs each stage with cancel / pause / resume / step control.
+
+

+
+
+ Browser / WASM-eligible
+ GPU inference
+ Cloud VLM
+
+
+
Control flow. Each stage runs inside trace_node(), emits running → done/skipped via core/detect/emit.py, and writes its result to a StageOutput row keyed by (job_id, stage_name). Between stages the runner checks three job-keyed flags: cancel (set_cancel_check), pause/resume (threading.Event), and pause-after-stage / step.
+
Skip flags. SKIP_VLM=1 emits skipped for escalate_vlm; SKIP_CLOUD=1 for escalate_cloud. Useful in CI and dev when you don't want to burn provider credits.
+
Full pipeline reference →
+
+
+
+
+ PROFILES & CHECKPOINTS
+ Profiles are the config mechanism; checkpoints + StageOutput power the replay loop.
+
+
+
Profile shape
+
One Profile row per content type (e.g. soccer_broadcast) holds two JSONB blobs:
+
+ pipeline — a PipelineConfig: stages + edges + routing rules. The runner topologically sorts the edges, falling back to stage order when no edges are defined.
+ configs — {stage_name: {...}} per-stage parameters: fps, thresholds, prompts, etc. Each stage parses its slice into a typed config (FrameExtractionConfig, OCRConfig, ...).
+
+
Convention: duplicate a profile and tweak it rather than patching defaults at the call site. Job-level config_overrides exist for one-off experiments but the resolved profile is the durable artifact.
+
+
Checkpoint tree
+
A Checkpoint row is a tree node: (parent_id, stage_name, config_overrides, stats). No blobs. Lets the UI show a branching history of "what configs did we try at this stage" without dragging frame data around.
+
+
StageOutput (flat upsert)
+
One row per (job_id, stage_name) holding the stage's output dict. Single-stage replay reads upstream outputs from here, so re-running match_brands with a tweaked threshold doesn't redo OCR. POST /replay-stage is the entry point.
+
+
Replay loop
+
The detection-app UI is the test surface: change a config, replay one stage, see the overlay rendered from the cached frame plus the new StageOutput. Frame caches keyed by timeline_id survive across replays — extract_frames only fires on the first run for a timeline.
+
+
+
+
+
+ INFERENCE TOPOLOGY
+ Stages can run in three places. The split is what keeps the dev box light and lets one GPU host serve the whole team.
+
+
+
Browser (OpenCV WASM)
+
Field and edge stages can run in a Web Worker via ui/detection-app/src/cv/wasmBridge.ts using OpenCV WASM directly — no TypeScript ports of the algorithms. This is the fast-iteration path for the replay loop: tweak a kernel size, rerun the stage on the cached frames, see the overlay update without touching a server.
+
+
API host (in-process)
+
With INFERENCE_URL="" (the dev default in ctrl/k8s/base/configmap.yaml) every CV/ML stage calls its routine in-process. Useful when there's no GPU host wired up; works for everything except heavy YOLO/VLM workloads.
+
+
GPU host (LAN)
+
Set INFERENCE_URL=http://gpu-host:8000 and the same stages POST to core/gpu/server.py. The GPU server exposes /detect, /ocr, /preprocess, /vlm, /detect_edges, /segment_field — each with a /debug variant that returns intermediate masks for the overlay viewer. Heavy ML deps live only in core/gpu/pyproject.toml; the API host doesn't import torch.
+
+
Cloud VLM providers
+
Last-resort escalation for unresolved candidates. core/detect/providers/ wraps Anthropic, Gemini, OpenAI, and Groq. Selection is per-profile config; SKIP_CLOUD=1 bypasses the stage entirely.
+
+
+
+
+
+ DATA MODEL
+ Tables generated by modelgen from core/schema/models/ into core/db/models.py (SQLModel).
+
+

+
+
+
+ MediaAsset — source video file with probe metadata (duration, fps, codec).
+ Profile — pipeline topology + per-stage config (JSONB).
+ Timeline — user-created selection of chunks from a source asset.
+ Job — one pipeline run on a timeline; parent_id chains replays into a tree.
+ Checkpoint — tree node of stage state, no blobs.
+ StageOutput — flat upsert per (job, stage), holds output JSONB and an optional checkpoint_id.
+ Brand — canonical name, aliases, source (ocr/local_vlm/cloud_llm/manual), airing history.
+
+
+
+
+
+ API
+ FastAPI under /detect/* (mounted from core/api/detect/). Through Envoy Gateway in dev the public path is /api/detect/...; /api/detect/stream/* gets an extended idle timeout for SSE.
+# Sources / timelines
+GET /sources
+GET /sources/{job_id}/chunks
+POST /timeline
+GET /timeline
+GET /timeline/{id}
+DELETE /timeline/{id}/cache
+
+# Run control
+POST /run
+POST /stop/{job_id}
+POST /pause/{job_id}
+POST /resume/{job_id}
+POST /step/{job_id}
+POST /pause-after-stage/{job_id}
+GET /status/{job_id}
+POST /clear/{job_id}
+
+# Live events
+GET /stream/{job_id} # SSE
+
+# Replay / checkpoints / overlays
+GET /checkpoints/{timeline_id}
+GET /checkpoints/{timeline_id}/{stage}
+GET /scenarios
+POST /replay
+POST /replay-stage
+POST /overlays
+GET /overlays/{timeline_id}/{job_id}/{stage}/{seq}
+
+# Config
+GET /config
+PUT /config
+GET /config/profiles
+GET /config/profiles/{name}/pipeline
+PUT /config/edge-transform
+GET /config/stages
+GET /config/stages/{stage_name}
+
+# Jobs
+GET /jobs
+GET /jobs/{id}
+
+
+
+ STORAGE
+ S3-compatible everywhere — MinIO locally, real S3 / GCS / R2 in cloud targets. The same boto3 code path serves both; only S3_ENDPOINT_URL and credentials change.
+
+
+ mpr-media-in — source video files (chunks).
+ mpr-media-out — per-job artifacts: extracted frame caches, debug overlays.
+
+
Heavy artifacts (frames, masks, overlays) live in object storage. Checkpoint and StageOutput rows in Postgres hold structured outputs and references to S3 keys, never blobs.
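+
+A minimal sketch of that switch (bucket name from the list above; key layout and env handling assumed):
+
+# same client code against MinIO locally and S3/GCS/R2 in cloud
+import os, boto3
+s3 = boto3.client("s3", endpoint_url=os.environ.get("S3_ENDPOINT_URL") or None)
+s3.upload_file("overlay.png", "mpr-media-out", "jobs/<job_id>/overlays/overlay.png")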
+
Full storage reference →
+
+
+
+
+ CODE GENERATION
+ Source-of-truth dataclasses in core/schema/models/ → typed code in four targets.
+
+
+ - SQLModel ORM tables → core/db/models.py
+ - Pydantic schemas (API request / response models)
+ - TypeScript types (UI)
+ - Protobuf definitions (gRPC stubs in core/rpc/)
+
+
+# regenerate everything
+bash ctrl/generate.sh
+
+
+
+ DEV ENVIRONMENT
+ Tilt + Kind for local dev. Routing via Envoy Gateway on port 8080 — no nginx-ingress.
+
+
The Tiltfile lives at ctrl/Tiltfile and applies the kustomize overlay ctrl/k8s/overlays/dev/. Cluster name: kind-mpr. Tilt port-forwards Envoy (8080) and MinIO (9000 API, 9001 console).
+
+ /api/detect/stream/* → FastAPI SSE (3600s idle timeout)
+ /api/* → FastAPI
+ /, /detection/* → detection-ui (with WS upgrade for Vite HMR)
+
+
+# Add to /etc/hosts
+127.0.0.1 mpr.local.ar k8s.mpr.local.ar
+
+# Bring the cluster up
+cd ctrl
+./kind-create.sh # one-time
+tilt up # builds + applies + port-forwards
+
+# UI: http://k8s.mpr.local.ar:8080/
+# API: http://k8s.mpr.local.ar:8080/api/
+# MinIO: http://localhost:9001 (console; admin / minioadmin)
+
+# Force a UI rebuild
+tilt trigger detection-ui
+
+
+
+ QUICK REFERENCE
+ Common commands and switches for working in MPR.
+# Render SVGs from DOT files
for f in docs/architecture/*.dot; do dot -Tsvg "$f" -o "${f%.dot}.svg"; done
-# Switch executor mode
-MPR_EXECUTOR=local # Celery + MinIO
-MPR_EXECUTOR=lambda # Step Functions + Lambda + S3
-MPR_EXECUTOR=gcp # Cloud Run Jobs + GCS
-
-
+
# Regenerate models from core/schema/models/
+bash ctrl/generate.sh
+
+
# Switch inference between local and GPU host
+INFERENCE_URL=                       # local (CV runs in API process)
+INFERENCE_URL=http://gpu-host:8000   # remote (core/gpu/server.py)
+
+
# Skip VLM escalation paths
+SKIP_VLM=1
+SKIP_CLOUD=1
+
+
# Tilt
+cd ctrl && tilt up
+tilt trigger detection-ui
+
diff --git a/docs/media-storage.html b/docs/media-storage.html
deleted file mode 100644
index 4d30a51..0000000
--- a/docs/media-storage.html
+++ /dev/null
@@ -1,125 +0,0 @@
-Media Storage Architecture
-Overview
-MPR separates media into input and output paths, each independently configurable. File paths are stored relative to their respective root to ensure portability between local development and cloud deployments (AWS S3, etc.).
-Storage Strategy
-Input / Output Separation
-| Path | Env Var | Purpose |
-|------|---------|---------|
-| MEDIA_IN | /app/media/in | Source media files to process |
-| MEDIA_OUT | /app/media/out | Transcoded/trimmed output files |
-These can point to different locations or even different servers/buckets in production.
-File Path Storage
-
-- Database: Stores only the relative path (e.g.,
videos/sample.mp4)
-- Input Root: Configurable via
MEDIA_IN env var
-- Output Root: Configurable via
MEDIA_OUT env var
-- Serving: Base URL configurable via
MEDIA_BASE_URL env var
-
-Why Relative Paths?
-
-- Portability: Same database works locally and in cloud
-- Flexibility: Easy to switch between storage backends
-- Simplicity: No need to update paths when migrating
-
-Local Development
-Configuration
-bash
-MEDIA_IN=/app/media/in
-MEDIA_OUT=/app/media/out
-File Structure
-/app/media/
-├── in/ # Source files
-│ ├── video1.mp4
-│ ├── video2.mp4
-│ └── subfolder/
-│ └── video3.mp4
-└── out/ # Transcoded output
- ├── video1_h264.mp4
- └── video2_trimmed.mp4
-Database Storage
-```
-Source assets (scanned from media/in)
-filename: video1.mp4
-file_path: video1.mp4
-filename: video3.mp4
-file_path: subfolder/video3.mp4
-```
-URL Serving
-
-- Nginx serves input via
location /media/in { alias /app/media/in; }
-- Nginx serves output via
location /media/out { alias /app/media/out; }
-- Frontend accesses:
http://mpr.local.ar/media/in/video1.mp4
-- Video player:
<video src="/media/in/video1.mp4" />
-
-AWS/Cloud Deployment
-S3 Configuration
-```bash
-Input and output can be different buckets/paths
-MEDIA_IN=s3://source-bucket/media/
-MEDIA_OUT=s3://output-bucket/transcoded/
-MEDIA_BASE_URL=https://source-bucket.s3.amazonaws.com/media/
-```
-S3 Structure
-```
-s3://source-bucket/media/
-├── video1.mp4
-└── subfolder/
- └── video3.mp4
-s3://output-bucket/transcoded/
-├── video1_h264.mp4
-└── video2_trimmed.mp4
-```
-Database Storage (Same!)
-```
-filename: video1.mp4
-file_path: video1.mp4
-filename: video3.mp4
-file_path: subfolder/video3.mp4
-```
-API Endpoints
-Scan Media Folder
-http
-POST /api/assets/scan
-Behavior:
-1. Recursively scans MEDIA_IN directory
-2. Finds all video/audio files (mp4, mkv, avi, mov, mp3, wav, etc.)
-3. Stores paths relative to MEDIA_IN
-4. Skips already-registered files (by filename)
-5. Returns summary: { found, registered, skipped, files }
-Create Job
-```http
-POST /api/jobs/
-Content-Type: application/json
-{
- "source_asset_id": "uuid",
- "preset_id": "uuid",
- "trim_start": 10.0,
- "trim_end": 30.0
-}
-```
-Behavior:
-- Server sets output_path using MEDIA_OUT + generated filename
-- Output goes to the output directory, not alongside source files
-Migration Guide
-Moving from Local to S3
-
--
-
Upload source files to S3:
- bash
- aws s3 sync /app/media/in/ s3://source-bucket/media/
- aws s3 sync /app/media/out/ s3://output-bucket/transcoded/
-
--
-
Update environment variables:
- bash
- MEDIA_IN=s3://source-bucket/media/
- MEDIA_OUT=s3://output-bucket/transcoded/
- MEDIA_BASE_URL=https://source-bucket.s3.amazonaws.com/media/
-
--
-
Database paths remain unchanged (already relative)
-
-
-Supported File Types
-Video: .mp4, .mkv, .avi, .mov, .webm, .flv, .wmv, .m4v
-Audio: .mp3, .wav, .flac, .aac, .ogg, .m4a
\ No newline at end of file
diff --git a/docs/viewer.html b/docs/viewer.html
new file mode 100644
index 0000000..f2869e6
--- /dev/null
+++ b/docs/viewer.html
@@ -0,0 +1,97 @@
+
+Graph Viewer
+
+[graph image placeholder]
+