# Detection Pipeline — Execution Path

## Overview

A pipeline run is a sequence of named **stages** that read and write a shared
`DetectState` dict. Stages are defined in `core/detect/stages/`; the orchestrator
(`core/detect/graph/runner.py`) flattens the profile's `PipelineConfig` graph into a
linear order, runs each stage, and emits SSE events to the browser.

The full stage list is in `core/detect/graph/nodes.py`:

```
extract_frames → filter_scenes
→ field_segmentation → detect_edges
→ detect_objects → preprocess → run_ocr
→ match_brands → escalate_vlm → escalate_cloud
→ compile_report
```

See `03-detection-pipeline.svg` for the graph view.
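
To make the stage contract concrete, here is a minimal sketch of the pattern described above; the type aliases, stage bodies, and state keys are illustrative assumptions, not the real code in `core/detect/stages/` or `runner.py`.

```python
from typing import Callable, TypeAlias

DetectState: TypeAlias = dict                     # shared mutable state for one job
Stage: TypeAlias = Callable[[DetectState], DetectState]

def extract_frames(state: DetectState) -> DetectState:
    state["frames"] = [f"frame-{i}" for i in range(3)]        # placeholder frames
    return state

def detect_objects(state: DetectState) -> DetectState:
    # reads what an upstream stage wrote, writes results for downstream stages
    state["detections"] = [{"frame": f, "boxes": []} for f in state["frames"]]
    return state

def run_pipeline(stages: list[tuple[str, Stage]], state: DetectState) -> DetectState:
    # the real runner flattens the profile's PipelineConfig graph into this list first
    for _name, stage in stages:
        state = stage(state)
    return state

final = run_pipeline([("extract_frames", extract_frames),
                      ("detect_objects", detect_objects)], {})
```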

## Profile

A `Profile` row in Postgres holds two JSONB blobs:

- `pipeline` — a `PipelineConfig` (stages + edges + routing rules) defining topology
- `configs` — `{stage_name: {...}}` per-stage parameters (fps, thresholds, prompts, ...)

Profiles are the config mechanism: **duplicate a profile and tweak it** instead of
patching defaults. `core/detect/profile.py` loads profiles by name; `_load_profile()`
in `nodes.py` merges the job's `config_overrides` on top.
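
As a rough illustration of that merge (the helper below is hypothetical, not `_load_profile()` itself; only the shape of the two dicts follows the description above):

```python
def merge_stage_configs(profile_configs: dict, config_overrides: dict) -> dict:
    # per-stage shallow merge: job overrides win over the profile's defaults
    merged = {}
    for stage_name, params in profile_configs.items():
        merged[stage_name] = {**params, **config_overrides.get(stage_name, {})}
    return merged

profile_configs = {"extract_frames": {"fps": 2}, "run_ocr": {"min_conf": 0.5}}
overrides = {"run_ocr": {"min_conf": 0.8}}
print(merge_stage_configs(profile_configs, overrides))
# {'extract_frames': {'fps': 2}, 'run_ocr': {'min_conf': 0.8}}
```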

## Stage runner

`PipelineRunner` (in `core/detect/graph/runner.py`) iterates the flattened stages and
between each one checks the control flags (all keyed by `job_id`):

- **cancel** — `set_cancel_check(job_id, fn)`; raises `PipelineCancelled` to abort
- **pause / resume** — a `threading.Event` per job; `_wait_if_paused()` blocks
- **step** — like resume but auto-pauses after the next stage completes
- **pause-after-stage** — toggle to step through every stage

Each stage runs inside `trace_node(state, name)` (sets a span used by tracing) and
emits `running` → `done` (or `skipped`) transitions via `core/detect/emit.py`.
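
A condensed sketch of that control loop; the flag plumbing and emit calls are simplified stand-ins for what `PipelineRunner` actually does:

```python
import threading

class PipelineCancelled(Exception):
    pass

def run_stages(stages, state, *, cancelled: threading.Event,
               resume: threading.Event, pause_after_stage: bool = False):
    for name, stage in stages:
        if cancelled.is_set():                 # cancel flag checked between stages
            raise PipelineCancelled(name)
        resume.wait()                          # blocks while the job is paused
        # emit.transition(name, "running") happens here in the real runner
        state = stage(state)
        # emit.transition(name, "done") or "skipped"
        if pause_after_stage:                  # step mode: pause again after each stage
            resume.clear()
    return state
```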

## Inference: GPU-host indirection

`core/detect/graph/nodes.py` reads `INFERENCE_URL` from the environment and passes it
to every CV/ML stage:

- `INFERENCE_URL=""` (default in dev) — stages call CV/ML routines in-process
- `INFERENCE_URL=http://gpu-host:8000` — stages POST to the GPU server
  (`core/gpu/server.py`) which exposes `/detect`, `/ocr`, `/preprocess`, `/vlm`,
  `/detect_edges`, `/segment_field` (each with a `/debug` variant that returns
  intermediate masks for the overlay viewer)

Memory note: dev and GPU machines are separate boxes on the same LAN; inference is a
network call. Heavy ML deps (`torch`, `transformers`, `paddleocr`) live only in
`core/gpu/pyproject.toml` — the API host doesn't need them.
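
As a sketch of the indirection (the helper, payload shape, and in-process fallback are assumptions; the endpoint path comes from the list above):

```python
import os
import requests   # only exercised when INFERENCE_URL points at the GPU host

INFERENCE_URL = os.environ.get("INFERENCE_URL", "")

def local_detect(frames):
    # stand-in for the in-process CV/ML path used when INFERENCE_URL is empty
    return [{"frame": f, "boxes": []} for f in frames]

def detect(frames):
    if not INFERENCE_URL:
        return local_detect(frames)
    resp = requests.post(f"{INFERENCE_URL}/detect", json={"frames": frames})
    resp.raise_for_status()
    return resp.json()
```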

## Browser-side CV (OpenCV WASM)

Some stages (notably the field/edge stages) can run in the browser via OpenCV WASM
(`ui/detection-app/src/cv/wasmBridge.ts`) for fast iteration without a round trip to
the GPU host. The browser UI is the test surface for the "replay loop" — change a
config, replay one stage, see the overlay. Browser CV uses OpenCV WASM directly; there
are no TypeScript ports of the algorithms.

## Cloud VLM escalation

`escalate_vlm` (local VLM on GPU host) and `escalate_cloud` (Anthropic / Gemini /
OpenAI / Groq via `core/detect/providers/`) are the last-resort branches for
unresolved candidates from `match_brands`. Skip flags:

- `SKIP_VLM=1` — emits `skipped` for `escalate_vlm`
- `SKIP_CLOUD=1` — emits `skipped` for `escalate_cloud`
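
A minimal sketch of how a skip flag might gate an escalation stage; the state keys and the short-circuit shape are assumptions, and the emit call is only indicated in a comment:

```python
import os

def escalate_vlm(state: dict) -> dict:
    if os.environ.get("SKIP_VLM") == "1":
        # emit.transition("escalate_vlm", "skipped") in the real stage
        return state
    # otherwise the unresolved candidates left over from match_brands go to the VLM
    state["vlm_input"] = state.get("unresolved_candidates", [])
    return state
```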

## Checkpoints, StageOutput, and replay

Two tables back the replay loop:

- **Checkpoint** (`core/db/models.py:Checkpoint`) — a tree node:
  `(parent_id, stage_name, config_overrides, stats)`. No blobs. Lets the UI show a
  branching history of "what configs did we try at this stage?"
- **StageOutput** — a flat upsert table keyed by `(job_id, stage_name)` holding the
  stage's output dict. `replay-stage` reads upstream outputs from here so a single
  stage can be re-run without rerunning the whole pipeline.
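
As a rough sketch of those two tables (column names follow the description above; the SQLAlchemy details are assumptions, not a copy of `core/db/models.py`):

```python
from sqlalchemy import JSON, Column, Integer, String, UniqueConstraint
from sqlalchemy.orm import declarative_base

Base = declarative_base()

class Checkpoint(Base):
    __tablename__ = "checkpoint"
    id = Column(Integer, primary_key=True)
    parent_id = Column(Integer, nullable=True)    # tree: one branch per config attempt
    stage_name = Column(String)
    config_overrides = Column(JSON)
    stats = Column(JSON)                          # summary numbers only, no blobs

class StageOutput(Base):
    __tablename__ = "stage_output"
    id = Column(Integer, primary_key=True)
    job_id = Column(String)
    stage_name = Column(String)
    output = Column(JSON)                         # the stage's output dict, upserted
    __table_args__ = (UniqueConstraint("job_id", "stage_name"),)
```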

API surface (`core/api/detect/replay.py`):

- `GET /checkpoints/{timeline_id}` — full tree
- `POST /replay` — clone a checkpoint into a new job, run from a chosen stage
- `POST /replay-stage` — re-run one stage in place using upstream `StageOutput` rows
- `GET /overlays/{timeline_id}/{job_id}/{stage}/{seq}` — debug overlays from MinIO
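
A hypothetical client call against `/replay-stage` (host, payload fields, and IDs are invented for illustration):

```python
import requests

BASE = "http://localhost:8000"   # assumed API host

resp = requests.post(
    f"{BASE}/replay-stage",
    json={
        "job_id": "job-123",                       # invented ID
        "stage": "match_brands",
        "config_overrides": {"match_brands": {"threshold": 0.7}},
    },
)
resp.raise_for_status()
print(resp.json())   # the re-run stage's output, fed from upstream StageOutput rows
```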

## Event flow (SSE)

Stages call `emit.transition(...)` / `emit.log(...)` / `emit.boxes(...)` etc.
(`core/detect/emit.py`). These push into Redis (`core/detect/events.py`). The SSE
endpoint `GET /detect/stream/{job_id}` (`core/api/detect/sse.py`) drains the Redis
list and writes to the open SSE response. Envoy keeps the connection open for up to
3600s (see `ctrl/k8s/base/envoy.yaml`).

```
stage code
→ emit.* (core/detect/emit.py)
→ push_detect_event → Redis RPUSH
→ [poll] /detect/stream/{job_id} → SSE chunk
→ fetch ReadableStream in detection-app
→ Pinia store update → Vue panel re-render
```
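
A compressed sketch of the Redis-to-SSE hop above (FastAPI and redis-py are assumed; the key format and polling interval are illustrative):

```python
import asyncio

import redis.asyncio as aioredis
from fastapi import FastAPI
from fastapi.responses import StreamingResponse

app = FastAPI()
r = aioredis.Redis()

@app.get("/detect/stream/{job_id}")
async def stream(job_id: str):
    async def gen():
        key = f"detect:events:{job_id}"            # assumed key format
        while True:
            item = await r.lpop(key)               # drain what emit.* RPUSHed
            if item is None:
                await asyncio.sleep(0.2)           # simple poll, as in the diagram
                continue
            yield f"data: {item.decode()}\n\n"     # one SSE chunk per event
    return StreamingResponse(gen(), media_type="text/event-stream")
```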

## Pipeline control endpoints

All under `core/api/detect/run.py`:

- `POST /run` — start a job from a timeline + profile
- `POST /stop/{job_id}` — cancel
- `POST /pause/{job_id}` / `POST /resume/{job_id}`
- `POST /step/{job_id}` — run one stage and pause
- `POST /pause-after-stage/{job_id}` — toggle pause-after-each-stage
- `GET /status/{job_id}` — current stage, progress
- `POST /clear/{job_id}` — discard runtime state
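
For example, stepping through a run from a client might look like this (host, payload fields, and the response shape are assumptions):

```python
import requests

BASE = "http://localhost:8000"                       # assumed API host

job = requests.post(f"{BASE}/run",
                    json={"timeline_id": "tl-1", "profile": "default"}).json()
job_id = job["job_id"]                               # assumed response field

requests.post(f"{BASE}/pause-after-stage/{job_id}")  # pause after every stage
requests.post(f"{BASE}/step/{job_id}")               # run one stage, then pause
print(requests.get(f"{BASE}/status/{job_id}").json())  # current stage + progress
```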

## Where the chunker UI fits

`ui/chunker/` is a **standalone testing utility** for the source-chunking step (split
a long source video into chunks the user picks for a Timeline). It is **not** a
pipeline stage and is not part of the detection flow. The detection pipeline reads
already-chunked sources from MinIO via `core/api/detect/sources.py`.

## Files

| Concern | File |
|---|---|
| Stage list | `core/detect/graph/nodes.py` |
| Runner (cancel/pause/resume) | `core/detect/graph/runner.py` |
| Profile loading | `core/detect/profile.py` |
| Event emission | `core/detect/emit.py`, `core/detect/events.py` |
| SSE endpoint | `core/api/detect/sse.py` |
| Replay API | `core/api/detect/replay.py` |
| Checkpoint storage | `core/detect/checkpoint/storage.py` |
| GPU server | `core/gpu/server.py` |
| Browser CV bridge | `ui/detection-app/src/cv/wasmBridge.ts` |
| Cloud VLM providers | `core/detect/providers/` |