# Detection Pipeline — Execution Path

## Overview
A pipeline run is a sequence of named stages that read and write a shared
`DetectState` dict. Stages are defined in `core/detect/stages/`; the orchestrator
(`core/detect/graph/runner.py`) flattens the profile's `PipelineConfig` graph into
a linear order, runs each stage, and emits SSE events to the browser.
The full stage list is in `core/detect/graph/nodes.py`:

```
extract_frames → filter_scenes
  → field_segmentation → detect_edges
  → detect_objects → preprocess → run_ocr
  → match_brands → escalate_vlm → escalate_cloud
  → compile_report
```
See `03-detection-pipeline.svg` for the graph view.
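
For orientation, a stage is conceptually a function over the shared state. The sketch below is illustrative only: the function signature and the state keys (`frames`, `edges`) are assumptions, not the actual interface in `core/detect/stages/`.

```python
from typing import Any

# Assumed shape of the shared state; the real DetectState lives in core/detect.
DetectState = dict[str, Any]

def detect_edges(state: DetectState, config: dict[str, Any]) -> dict[str, Any]:
    """Hypothetical stage: read upstream keys from the shared dict, write
    results back, and return the stage's own output dict."""
    frames = state["frames"]                    # assumed key written by extract_frames
    threshold = config.get("threshold", 0.5)    # per-stage parameter from the profile
    edges = [frame for frame in frames]         # stand-in for the real CV routine
    state["edges"] = edges                      # assumed key read by downstream stages
    return {"edge_count": len(edges)}
```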

## Profile

A `Profile` row in Postgres holds two JSONB blobs:

- `pipeline` — a `PipelineConfig` (stages + edges + routing rules) defining the topology
- `configs` — `{stage_name: {...}}` per-stage parameters (fps, thresholds, prompts, ...)
Profiles are the config mechanism: duplicate a profile and tweak it instead of
patching defaults. `core/detect/profile.py` loads profiles by name; `_load_profile()`
in `nodes.py` merges the job's `config_overrides` on top.
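
A minimal sketch of the two blobs and the override merge, assuming the key names follow the prose above (the real schema and merge semantics may differ):

```python
# Illustrative Profile contents; stage names come from the list above, the
# parameter values are made up.
profile = {
    "pipeline": {  # PipelineConfig: stages + edges + routing rules
        "stages": ["extract_frames", "filter_scenes", "detect_objects"],
        "edges": [("extract_frames", "filter_scenes"),
                  ("filter_scenes", "detect_objects")],
        "routing": {},  # routing rules omitted in this sketch
    },
    "configs": {
        "extract_frames": {"fps": 2},
        "detect_objects": {"threshold": 0.4},
    },
}

def merge_overrides(configs: dict, overrides: dict) -> dict:
    """What _load_profile() is described as doing: the job's config_overrides
    win over the profile's per-stage configs (merge semantics assumed)."""
    stages = configs.keys() | overrides.keys()
    return {s: {**configs.get(s, {}), **overrides.get(s, {})} for s in stages}
```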

## Stage runner

`PipelineRunner` (in `core/detect/graph/runner.py`) iterates the flattened stages
and between each one checks the job's control flags (all keyed by `job_id`):

- cancel — `set_cancel_check(job_id, fn)`; raises `PipelineCancelled` to abort
- pause / resume — a `threading.Event` per job; `_wait_if_paused()` blocks
- step — like resume, but auto-pauses after the next stage completes
- pause-after-stage — toggle to step through every stage
Each stage runs inside `trace_node(state, name)` (sets a span used by tracing) and
emits `running → done` (or `skipped`) transitions via `core/detect/emit.py`.
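
Put together, the inter-stage checks look roughly like the sketch below. Everything here (helper names, data structures, Event polarity) is an assumption based on the behavior described above, not the real runner:

```python
import threading

class PipelineCancelled(Exception):
    """Raised between stages when a job's cancel flag fires."""

# Per-job control state, keyed by job_id as in the real runner.
_cancelled: set[str] = set()
_step_mode: set[str] = set()
_pause_events: dict[str, threading.Event] = {}

def _running_event() -> threading.Event:
    ev = threading.Event()
    ev.set()  # set == running; clear() == paused
    return ev

def run_stages(job_id: str, stages, state) -> None:
    for name, stage_fn in stages:
        if job_id in _cancelled:
            raise PipelineCancelled(name)
        # _wait_if_paused(): blocks while the per-job Event is cleared
        _pause_events.setdefault(job_id, _running_event()).wait()
        # trace_node(state, name) would open a tracing span here;
        # emit.transition(...) sends running → done around the call
        stage_fn(state)
        if job_id in _step_mode:
            _pause_events[job_id].clear()  # step: auto-pause after this stage
```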

## Inference: GPU-host indirection

`core/detect/graph/nodes.py` reads `INFERENCE_URL` from the environment and passes
it to every CV/ML stage:

- `INFERENCE_URL=""` (default in dev) — stages call CV/ML routines in-process
- `INFERENCE_URL=http://gpu-host:8000` — stages POST to the GPU server
  (`core/gpu/server.py`), which exposes `/detect`, `/ocr`, `/preprocess`, `/vlm`,
  `/detect_edges`, and `/segment_field` (each with a `/debug` variant that returns
  intermediate masks for the overlay viewer)
Note: dev and GPU machines are separate boxes on the same LAN, so inference is a
network call. Heavy ML deps (torch, transformers, paddleocr) live only in
`core/gpu/pyproject.toml` — the API host doesn't need them.
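
The dispatch might look like the sketch below; only the env var and the `/detect` route come from the text, while the payload shape, timeout, and in-process import are assumptions:

```python
import os

import httpx  # assumed HTTP client; nodes.py's actual transport isn't specified here

INFERENCE_URL = os.environ.get("INFERENCE_URL", "")

def detect(frame_keys: list[str], params: dict) -> dict:
    """In-process when INFERENCE_URL is empty, otherwise POST to the GPU host."""
    if not INFERENCE_URL:
        from core.cv.detect import run_local  # hypothetical in-process routine
        return run_local(frame_keys, **params)
    resp = httpx.post(
        f"{INFERENCE_URL}/detect",
        json={"frames": frame_keys, "params": params},  # payload shape assumed
        timeout=120.0,
    )
    resp.raise_for_status()
    return resp.json()
```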

## Browser-side CV (OpenCV WASM)

Some stages (notably the field/edge stages) can run in the browser via OpenCV WASM
(`ui/detection-app/src/cv/wasmBridge.ts`) for fast iteration without a round trip
to the GPU host. The browser UI is the test surface for the "replay loop" — change
a config, replay one stage, see the overlay. Browser CV uses OpenCV WASM directly;
there are no TypeScript ports of the algorithms.

## Cloud VLM escalation

`escalate_vlm` (local VLM on the GPU host) and `escalate_cloud` (Anthropic /
Gemini / OpenAI / Groq via `core/detect/providers/`) are the last-resort branches
for unresolved candidates from `match_brands`. Skip flags:

- `SKIP_VLM=1` — emits `skipped` for `escalate_vlm`
- `SKIP_CLOUD=1` — emits `skipped` for `escalate_cloud`
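
The flag check presumably amounts to something like this; the helper name and the emit call shape are assumptions:

```python
import os

def should_skip(flag: str) -> bool:
    """Hypothetical helper: SKIP_VLM=1 / SKIP_CLOUD=1 short-circuit the stage."""
    return os.environ.get(flag) == "1"

# Inside escalate_vlm, roughly:
#   if should_skip("SKIP_VLM"):
#       emit.transition(state, "escalate_vlm", "skipped")
#       return {}
```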

## Checkpoints, StageOutput, and replay

Two tables back the replay loop:

- `Checkpoint` (`core/db/models.py:Checkpoint`) — a tree node:
  `(parent_id, stage_name, config_overrides, stats)`. No blobs. Lets the UI show a
  branching history of "what configs did we try at this stage?"
- `StageOutput` — a flat upsert table keyed by `(job_id, stage_name)` holding the
  stage's output dict. `replay-stage` reads upstream outputs from here so a single
  stage can be re-run without rerunning the whole pipeline.
API surface (`core/api/detect/replay.py`):

- `GET /checkpoints/{timeline_id}` — full tree
- `POST /replay` — clone a checkpoint into a new job, run from a chosen stage
- `POST /replay-stage` — re-run one stage in place using upstream `StageOutput` rows
- `GET /overlays/{timeline_id}/{job_id}/{stage}/{seq}` — debug overlays from MinIO
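
As a usage sketch (the paths come from the list above; host, IDs, and body fields are assumptions):

```python
import httpx

BASE = "http://localhost:8000"  # assumed API host

# Fetch the full checkpoint tree for a timeline (ID made up):
tree = httpx.get(f"{BASE}/checkpoints/42").json()

# Re-run one stage in place; upstream inputs come from StageOutput rows,
# so nothing before detect_edges is recomputed. Body fields are assumed.
httpx.post(f"{BASE}/replay-stage", json={
    "job_id": "job-123",
    "stage_name": "detect_edges",
    "config_overrides": {"detect_edges": {"threshold": 0.7}},
}).raise_for_status()
```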

## Event flow (SSE)

Stages call `emit.transition(...)` / `emit.log(...)` / `emit.boxes(...)` etc.
(`core/detect/emit.py`). These push into Redis (`core/detect/events.py`). The SSE
endpoint `GET /detect/stream/{job_id}` (`core/api/detect/sse.py`) drains the Redis
list and writes to the open SSE response. Envoy keeps the connection open for up
to 3600s (see `ctrl/k8s/base/envoy.yaml`).
```
stage code
  → emit.* (core/detect/emit.py)
  → push_detect_event → Redis RPUSH
  → [poll] /detect/stream/{job_id} → SSE chunk
  → fetch ReadableStream in detection-app
  → Pinia store update → Vue panel re-render
```
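
A minimal sketch of the server-side drain loop, assuming one Redis list per job (the key scheme, payload format, and poll interval are assumptions):

```python
import asyncio

import redis.asyncio as redis
from fastapi import FastAPI
from fastapi.responses import StreamingResponse

app = FastAPI()
r = redis.Redis()

@app.get("/detect/stream/{job_id}")
async def stream(job_id: str) -> StreamingResponse:
    async def gen():
        while True:
            event = await r.lpop(f"detect:events:{job_id}")  # assumed key scheme
            if event is None:
                await asyncio.sleep(0.25)  # the [poll] step in the diagram
                continue
            yield f"data: {event.decode()}\n\n"  # one SSE chunk per event
    return StreamingResponse(gen(), media_type="text/event-stream")
```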

## Pipeline control endpoints

All under `core/api/detect/run.py`:

- `POST /run` — start a job from a timeline + profile
- `POST /stop/{job_id}` — cancel
- `POST /pause/{job_id}` / `POST /resume/{job_id}`
- `POST /step/{job_id}` — run one stage and pause
- `POST /pause-after-stage/{job_id}` — toggle pause-after-each-stage
- `GET /status/{job_id}` — current stage, progress
- `POST /clear/{job_id}` — discard runtime state
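
Stepping through a run from a script might look like this; only the routes come from the list above, while the mount point, request body, and response fields are assumptions:

```python
import time

import httpx

BASE = "http://localhost:8000/detect"  # assumed mount point

job = httpx.post(f"{BASE}/run", json={"timeline_id": 42, "profile": "default"}).json()
job_id = job["job_id"]  # assumed response field

httpx.post(f"{BASE}/pause-after-stage/{job_id}")  # pause after every stage
while httpx.get(f"{BASE}/status/{job_id}").json().get("stage") != "compile_report":
    httpx.post(f"{BASE}/step/{job_id}")  # run exactly one stage, then pause
    time.sleep(1)
```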

## Where the chunker UI fits

`ui/chunker/` is a standalone testing utility for the source-chunking step (split
a long source video into chunks the user picks for a Timeline). It is not a
pipeline stage and is not part of the detection flow. The detection pipeline reads
already-chunked sources from MinIO via `core/api/detect/sources.py`.

## Files

| Concern | File |
|---|---|
| Stage list | `core/detect/graph/nodes.py` |
| Runner (cancel/pause/resume) | `core/detect/graph/runner.py` |
| Profile loading | `core/detect/profile.py` |
| Event emission | `core/detect/emit.py`, `core/detect/events.py` |
| SSE endpoint | `core/api/detect/sse.py` |
| Replay API | `core/api/detect/replay.py` |
| Checkpoint storage | `core/detect/checkpoint/storage.py` |
| GPU server | `core/gpu/server.py` |
| Browser CV bridge | `ui/detection-app/src/cv/wasmBridge.ts` |
| Cloud VLM providers | `core/detect/providers/` |