Media & Artifact Storage
Overview
MPR stores all media and heavy artifacts on S3-compatible object storage. Locally that's MinIO; in any
cloud target (AWS, GCS via HMAC, Cloudflare R2, etc.) it's the provider's S3 API. The
code in core/storage/ uses boto3 throughout; only the endpoint URL and credentials
change between environments.
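As a minimal sketch of how one boto3 code path can serve every environment, the helper below derives the client arguments from the S3_ENDPOINT_URL and AWS_REGION variables used in this document. The function names `s3_client_kwargs` and `make_s3_client` are hypothetical and are not taken from core/storage/; the real module may be organized differently.

```python
import os
from typing import Mapping, Optional


def s3_client_kwargs(env: Optional[Mapping[str, str]] = None) -> dict:
    """Build keyword arguments for boto3.client("s3", ...) from environment variables."""
    env = os.environ if env is None else env
    kwargs = {}
    # MinIO, GCS and R2 need an explicit endpoint; plain AWS S3 does not.
    endpoint = env.get("S3_ENDPOINT_URL")
    if endpoint:
        kwargs["endpoint_url"] = endpoint
    region = env.get("AWS_REGION")
    if region:
        kwargs["region_name"] = region
    return kwargs


def make_s3_client(env: Optional[Mapping[str, str]] = None):
    """Create the client; boto3 itself reads AWS_ACCESS_KEY_ID and
    AWS_SECRET_ACCESS_KEY from the process environment."""
    import boto3  # imported lazily so s3_client_kwargs works without boto3 installed

    return boto3.client("s3", **s3_client_kwargs(env))
```

With no endpoint set, boto3 falls back to the regular AWS endpoints, which is why the AWS configuration can omit S3_ENDPOINT_URL entirely.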
What goes where
| Bucket / prefix | Contents | Producer | Consumer |
|---|---|---|---|
| mpr-media-in | Source video files (chunks the user uploaded or device-recorded) | user / chunker UI | extract_frames stage, core/api/detect/sources.py |
| mpr-media-out | Per-job artifacts: extracted frame caches, debug overlays | pipeline stages, core/api/detect/replay.py overlays endpoints | UI panels (frame strip, overlay viewer) |
Both buckets live behind the same S3 client (core/storage/). DB rows store relative
keys (e.g. chunks/2025-04-15/match-01.mp4); the bucket is implicit.
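One way to read "the bucket is implicit" is that the caller already knows whether a row refers to a source file or a job artifact, and picks the bucket from that. The sketch below assumes exactly that; the `locate` function and the "source" / "artifact" labels are hypothetical illustrations, not names from the codebase.

```python
from typing import Tuple


def locate(
    kind: str,
    key: str,
    bucket_in: str = "mpr-media-in",
    bucket_out: str = "mpr-media-out",
) -> Tuple[str, str]:
    """Pair a relative key from the database with the bucket it lives in."""
    # Rows are expected to hold relative keys, never full URLs or absolute paths.
    if key.startswith("/") or "://" in key:
        raise ValueError(f"expected a relative key, got {key!r}")
    buckets = {"source": bucket_in, "artifact": bucket_out}
    if kind not in buckets:
        raise ValueError(f"unknown kind {kind!r}")
    return buckets[kind], key
```

Keeping bucket names out of the database means the same rows remain valid when S3_BUCKET_IN and S3_BUCKET_OUT point at differently named buckets in another environment.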
Local development (MinIO)
S3_ENDPOINT_URL=http://minio:9000
S3_BUCKET_IN=mpr-media-in
S3_BUCKET_OUT=mpr-media-out
AWS_ACCESS_KEY_ID=minioadmin
AWS_SECRET_ACCESS_KEY=minioadmin
In the Tilt setup, MinIO runs as a k8s Deployment with port-forwards for 9000 (S3 API)
and 9001 (web console). A minio-init job creates the buckets on first start.
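What the minio-init job actually runs is not shown here (it may well use the MinIO `mc` CLI). As a rough boto3 equivalent, an idempotent "create the buckets if they are missing" step could look like the following; `ensure_buckets` is a hypothetical name.

```python
from typing import Iterable, List


def ensure_buckets(client, names: Iterable[str]) -> List[str]:
    """Create any missing buckets and return the names that were created.

    `client` is expected to behave like a boto3 S3 client.
    """
    existing = {b["Name"] for b in client.list_buckets().get("Buckets", [])}
    created = []
    for name in names:
        if name not in existing:
            # Sufficient for MinIO; AWS regions other than us-east-1 also
            # require a CreateBucketConfiguration argument.
            client.create_bucket(Bucket=name)
            created.append(name)
    return created
```

Because the function only creates what is missing, running it on every start is harmless, which matches the "on first start" behaviour described above.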
Cloud (AWS S3 / GCS / others)
# AWS S3 — no endpoint URL needed
S3_BUCKET_IN=...
S3_BUCKET_OUT=...
AWS_REGION=us-east-1
AWS_ACCESS_KEY_ID=...
AWS_SECRET_ACCESS_KEY=...
# GCS via HMAC
S3_ENDPOINT_URL=https://storage.googleapis.com
AWS_ACCESS_KEY_ID=<gcs hmac access>
AWS_SECRET_ACCESS_KEY=<gcs hmac secret>
Database vs. object storage
Heavy artifacts (frames, masks, overlays) live in MinIO/S3. The Checkpoint and
StageOutput tables in Postgres (see 02-data-model.svg) hold structured outputs
(detections, stats, references to S3 keys) — never blobs. Frame caches keyed by
timeline_id are written by the first run of extract_frames and reused by every
later replay on the same timeline.
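The write-once, reuse-on-replay behaviour of the frame cache can be sketched as a prefix check before extraction. This is an illustration under assumptions: the `frames/<timeline_id>/` key layout and both function names are hypothetical, and the real extract_frames stage may detect an existing cache differently.

```python
from typing import Callable


def frame_cache_prefix(timeline_id: str) -> str:
    # Hypothetical key layout; the real prefix is not documented here.
    return f"frames/{timeline_id}/"


def ensure_frame_cache(
    client,
    bucket: str,
    timeline_id: str,
    extract: Callable[[str], None],
) -> bool:
    """Run `extract(prefix)` only when no cached frames exist.

    Returns True on a cache hit, False when extraction had to run.
    `client` is expected to behave like a boto3 S3 client.
    """
    prefix = frame_cache_prefix(timeline_id)
    # One key under the prefix is enough to know the cache was written.
    resp = client.list_objects_v2(Bucket=bucket, Prefix=prefix, MaxKeys=1)
    if resp.get("KeyCount", 0) > 0:
        return True
    extract(prefix)
    return False
```

A check like this treats a partially written cache as a hit, so a production version would probably write a marker object last and test for that instead.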
Storage module
core/storage/ exposes the small set of helpers callers need:
from core.storage import (
    get_s3_client,
    list_objects,
    download_file,
    download_to_temp,
    upload_file,
    get_presigned_url,
    BUCKET_IN,
    BUCKET_OUT,
)
Anything else (multipart, lifecycle, versioning) is the bucket's responsibility, not the application's.
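To give a sense of how thin such helpers can be, here is a sketch of what a presigned-URL helper might wrap. The signature of the real get_presigned_url is not shown in this document, so `presigned_get_url` and its parameters are assumptions; only the underlying boto3 call, `generate_presigned_url`, is a known API.

```python
def presigned_get_url(client, bucket: str, key: str, expires_in: int = 3600) -> str:
    """Return a time-limited GET URL for one object.

    `client` is expected to behave like a boto3 S3 client.
    """
    return client.generate_presigned_url(
        "get_object",
        Params={"Bucket": bucket, "Key": key},
        ExpiresIn=expires_in,  # lifetime of the URL in seconds
    )
```

A UI panel could then load a frame or overlay directly from object storage with such a URL, without the API proxying the bytes.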