MPR - Media Processor
Media transcoding platform with three execution modes: local (Celery + MinIO), AWS (Step Functions + Lambda + S3), and GCP (Cloud Run Jobs + GCS). Storage is S3-compatible across all environments.
System Overview
Local Architecture (Development)
AWS Architecture (Production)
GCP Architecture (Production)
Components
- Reverse Proxy (nginx)
- Application Layer (Django Admin, GraphQL API, Timeline UI)
- Worker Layer (Celery local mode)
- AWS (Step Functions, Lambda)
- GCP (Cloud Run Jobs + GCS)
- Data Layer (PostgreSQL, Redis)
- S3-compatible Storage (MinIO / AWS S3 / GCS)
Data Model
Entity Relationships
Entities
- MediaAsset - Video/audio files with metadata
- TranscodePreset - Encoding configurations
- TranscodeJob - Processing queue items
Job Flow
Job Lifecycle
Job States
- PENDING - Waiting in queue
- PROCESSING - Worker executing
- COMPLETED - Success
- FAILED - Error occurred
- CANCELLED - User cancelled
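The five states above can be modelled as a small state machine. The following is a minimal sketch, not MPR's actual model: the transition table is an assumption inferred from the state descriptions and from the cancelJob and retryJob mutations, and the stored string values are also assumed.

```python
from enum import Enum


class JobStatus(str, Enum):
    # Names come from the Job States list; the string values are an assumption.
    PENDING = "PENDING"
    PROCESSING = "PROCESSING"
    COMPLETED = "COMPLETED"
    FAILED = "FAILED"
    CANCELLED = "CANCELLED"


# Assumed transition table (hypothetical, not taken from the MPR source).
ALLOWED_TRANSITIONS = {
    JobStatus.PENDING: {JobStatus.PROCESSING, JobStatus.CANCELLED},
    JobStatus.PROCESSING: {
        JobStatus.COMPLETED,
        JobStatus.FAILED,
        JobStatus.CANCELLED,
    },
    JobStatus.COMPLETED: set(),
    # Assumption: retrying a failed job puts it back in the queue.
    JobStatus.FAILED: {JobStatus.PENDING},
    JobStatus.CANCELLED: set(),
}


def can_transition(current: JobStatus, target: JobStatus) -> bool:
    """Return True if moving from `current` to `target` is allowed."""
    return target in ALLOWED_TRANSITIONS[current]
```

Keeping the table in one place makes it easy to reject invalid updates, for example a late worker callback trying to mark a cancelled job as completed.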
Execution Modes
- Local: Celery + MinIO (S3 API) + FFmpeg
- Lambda: Step Functions + Lambda + AWS S3
- GCP: Cloud Run Jobs + GCS (S3-compatible API)
Media Storage
MPR separates media into input and output paths, each independently configurable. File paths are stored relative to their respective root to ensure portability between local development and cloud deployments.
Input / Output Separation
- MEDIA_IN - Source media files to process
- MEDIA_OUT - Transcoded/trimmed output files
Why Relative Paths?
- Portability: Same database works locally and in cloud
- Flexibility: Easy to switch between storage backends
- Simplicity: No need to update paths when migrating
Local Development
MEDIA_IN=/app/media/in
MEDIA_OUT=/app/media/out
/app/media/
├── in/ # Source files
│ ├── video1.mp4
│ └── subfolder/video3.mp4
└── out/ # Transcoded output
└── video1_h264.mp4
AWS/Cloud Deployment
MEDIA_IN=s3://source-bucket/media/
MEDIA_OUT=s3://output-bucket/transcoded/
MEDIA_BASE_URL=https://source-bucket.s3.amazonaws.com/media/
Database paths remain unchanged (already relative). Just upload files to S3 and update environment variables.
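To illustrate why relative paths make the switch cheap, here is a minimal sketch of resolving a stored path against whichever root is configured. The helper name `resolve_media_path` is hypothetical; only the MEDIA_IN values come from the examples above.

```python
def resolve_media_path(root: str, relative_path: str) -> str:
    """Join a storage root (local directory or s3:// URL) with a stored relative path."""
    # Reject absolute paths and parent-directory escapes so that a stored
    # path can never point outside the configured root.
    if relative_path.startswith("/") or ".." in relative_path.split("/"):
        raise ValueError(f"expected a relative path, got {relative_path!r}")
    return root.rstrip("/") + "/" + relative_path


# The same database value resolves correctly under either configuration.
print(resolve_media_path("/app/media/in", "subfolder/video3.mp4"))
print(resolve_media_path("s3://source-bucket/media/", "subfolder/video3.mp4"))
```

The first call yields `/app/media/in/subfolder/video3.mp4` and the second `s3://source-bucket/media/subfolder/video3.mp4`; the database row is the same in both cases.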
Full Media Storage Documentation →
Chunker Pipeline
The chunker pipeline splits media into time-based segments and streams real-time events from worker threads through Redis and gRPC-Web to the browser UI. Each event makes 7 hops from worker thread to pixel.
Event Path
Worker thread → Pipeline._emit() → event_bridge() → Redis RPUSH
→ [50ms poll] gRPC server LRANGE → yield protobuf
→ HTTP/2 frame → Envoy (grpc-web filter)
→ HTTP/1.1 chunk → nginx (proxy_buffering off)
→ fetch ReadableStream → protobuf-ts decode
→ setEvents([...prev, evt]) → React re-render
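The first two hops of the path above (RPUSH on the worker side, polled LRANGE on the gRPC side) can be sketched as a cursor over a Redis list. This is an illustration under assumptions, not the MPR implementation: the key format `events:<job_id>`, the function names, and the bounded `max_polls` loop are hypothetical, and an in-memory stand-in replaces Redis so the example runs without a server.

```python
import json
import time
from typing import Iterator


class InMemoryList:
    """Stand-in for the two Redis list commands used here (RPUSH, LRANGE)."""

    def __init__(self) -> None:
        self._lists = {}

    def rpush(self, key: str, value: str) -> int:
        self._lists.setdefault(key, []).append(value)
        return len(self._lists[key])

    def lrange(self, key: str, start: int, end: int) -> list:
        items = self._lists.get(key, [])
        # Redis treats `end` as inclusive and -1 as "last element".
        stop = len(items) if end == -1 else end + 1
        return items[start:stop]


def emit(store, job_id: str, event: dict) -> None:
    """Worker side: append one JSON-encoded event to the job's list."""
    store.rpush(f"events:{job_id}", json.dumps(event))


def stream_events(store, job_id: str, poll_interval: float = 0.05,
                  max_polls: int = 3) -> Iterator[dict]:
    """Server side: poll the list and yield only events not seen before."""
    cursor = 0
    for _ in range(max_polls):
        batch = store.lrange(f"events:{job_id}", cursor, -1)
        for raw in batch:
            yield json.loads(raw)
        cursor += len(batch)  # the next poll starts after the last seen event
        time.sleep(poll_interval)
```

Because the list is append-only and the reader keeps its own cursor, several subscribers can replay the same job's events independently. A real server would presumably loop until a terminal event rather than for a fixed number of polls.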
Thread Model (inside Celery worker)
Celery worker process
└─ run_job task thread
└─ Pipeline.run()
├─ Producer thread — enqueues chunks
├─ Monitor thread — emits progress every 500ms
├─ Worker thread 0 — pulls from queue, processes
├─ Worker thread 1 — pulls from queue, processes
├─ Worker thread 2 — pulls from queue, processes
└─ Worker thread 3 — pulls from queue, processes
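The thread layout above maps onto a standard producer/consumer pattern. The sketch below uses only the Python standard library; `run_pipeline`, `process`, and `on_progress` are hypothetical names, and the real Pipeline.run() may be structured differently.

```python
import queue
import threading

NUM_WORKERS = 4
STOP = object()  # sentinel telling a worker thread to exit


def run_pipeline(chunks, process, progress_interval=0.5, on_progress=print):
    """Process `chunks` with NUM_WORKERS threads and report progress periodically."""
    work = queue.Queue()
    done = []
    lock = threading.Lock()
    finished = threading.Event()

    def producer():
        for chunk in chunks:
            work.put(chunk)
        for _ in range(NUM_WORKERS):
            work.put(STOP)  # one sentinel per worker

    def worker():
        while True:
            chunk = work.get()
            if chunk is STOP:
                return
            result = process(chunk)
            with lock:
                done.append(result)

    def monitor():
        # Event.wait() returns True once `finished` is set, which ends the loop.
        while not finished.wait(progress_interval):
            with lock:
                on_progress(len(done))

    threads = [threading.Thread(target=producer)]
    threads += [threading.Thread(target=worker) for _ in range(NUM_WORKERS)]
    monitor_thread = threading.Thread(target=monitor)
    for t in threads + [monitor_thread]:
        t.start()
    for t in threads:
        t.join()
    finished.set()  # stop the monitor after all work is done
    monitor_thread.join()
    return done
```

Results arrive in completion order, not chunk order, so a caller that needs ordering would have to carry a chunk index through `process`.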
Infrastructure
- nginx :80 - Reverse proxy, static file serving
- fastapi :8702 - GraphQL API (Strawberry)
- celery - Task worker (runs pipeline)
- redis :6379 - Event bus + Celery broker
- grpc :50051 - gRPC server (StreamChunkPipeline)
- envoy :8090 - gRPC-Web ↔ native gRPC translation
- minio :9000 - S3-compatible source media storage
- postgres :5432 - Job/asset metadata
Full Chunker Pipeline Documentation →
API (GraphQL)
All client interactions go through GraphQL at /graphql.
# GraphiQL IDE
http://mpr.local.ar/graphql
# Queries
query { assets(status: "ready") { id filename duration } }
query { jobs(status: "processing") { id status progress } }
query { presets { id name container videoCodec } }
query { systemStatus { status version } }
# Mutations
mutation { scanMediaFolder { found registered skipped } }
mutation { createJob(input: { sourceAssetId: "...", presetId: "..." }) { id status } }
mutation { cancelJob(id: "...") { id status } }
mutation { retryJob(id: "...") { id status } }
mutation { updateAsset(id: "...", input: { comments: "..." }) { id comments } }
mutation { deleteAsset(id: "...") { ok } }
# Lambda callback (REST)
POST /api/jobs/{id}/callback - Lambda completion webhook
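The queries above can also be sent from a script. Here is a minimal sketch using only the standard library, assuming the endpoint accepts the usual GraphQL-over-HTTP JSON body (`query` plus `variables`). The helper names are hypothetical, and the URL is the local one listed under Access Points.

```python
import json
import urllib.request

GRAPHQL_URL = "http://mpr.local.ar/graphql"  # local endpoint from Access Points


def build_graphql_request(query, variables=None, url=GRAPHQL_URL):
    """Build a POST request carrying a GraphQL query as a JSON body."""
    payload = json.dumps({"query": query, "variables": variables or {}})
    return urllib.request.Request(
        url,
        data=payload.encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def run_query(query, variables=None):
    """Send the query and return the decoded JSON response (needs a running stack)."""
    request = build_graphql_request(query, variables)
    with urllib.request.urlopen(request, timeout=10) as response:
        return json.loads(response.read())
```

With the local stack running, `run_query("query { presets { id name container videoCodec } }")` should return the preset list under the response's `data` key.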
Supported File Types:
Video: mp4, mkv, avi, mov, webm, flv, wmv, m4v
Audio: mp3, wav, flac, aac, ogg, m4a
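A scan such as scanMediaFolder has to decide which files count as media. One plausible way to do that, sketched here as an assumption and not as MPR's actual logic, is a case-insensitive extension lookup against the two lists above; `media_kind` is a hypothetical helper.

```python
from pathlib import PurePosixPath

VIDEO_EXTENSIONS = {"mp4", "mkv", "avi", "mov", "webm", "flv", "wmv", "m4v"}
AUDIO_EXTENSIONS = {"mp3", "wav", "flac", "aac", "ogg", "m4a"}


def media_kind(filename):
    """Return "video", "audio", or None for unsupported files."""
    # Compare the final extension only, ignoring case.
    ext = PurePosixPath(filename).suffix.lstrip(".").lower()
    if ext in VIDEO_EXTENSIONS:
        return "video"
    if ext in AUDIO_EXTENSIONS:
        return "audio"
    return None
```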
Access Points
# Add to /etc/hosts
127.0.0.1 mpr.local.ar
# URLs
http://mpr.local.ar/admin - Django Admin
http://mpr.local.ar/graphql - GraphiQL IDE
http://mpr.local.ar/ - Timeline UI
http://mpr.local.ar/chunker/ - Chunker UI
http://localhost:9001 - MinIO Console
# AWS deployment
https://mpr.mcrn.ar/ - Production
Quick Reference
# Render SVGs from DOT files
for f in docs/architecture/*.dot; do dot -Tsvg "$f" -o "${f%.dot}.svg"; done
# Switch executor mode
MPR_EXECUTOR=local # Celery + MinIO
MPR_EXECUTOR=lambda # Step Functions + Lambda + S3
MPR_EXECUTOR=gcp # Cloud Run Jobs + GCS
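On the application side, the MPR_EXECUTOR switch can be read and validated once at startup. This is a minimal sketch: the function name and the `local` default are assumptions, while the three mode names come from the list above.

```python
import os

# Mode names from the Quick Reference; descriptions are for error messages only.
EXECUTORS = {
    "local": "Celery + MinIO",
    "lambda": "Step Functions + Lambda + S3",
    "gcp": "Cloud Run Jobs + GCS",
}


def select_executor(env=None):
    """Return the configured executor mode, failing fast on unknown values."""
    env = os.environ if env is None else env
    mode = env.get("MPR_EXECUTOR", "local")  # assumed default, not from the source
    if mode not in EXECUTORS:
        valid = ", ".join(sorted(EXECUTORS))
        raise ValueError(f"MPR_EXECUTOR={mode!r} is not one of: {valid}")
    return mode
```

Failing at startup on a misspelled mode is cheaper than discovering it when the first job is dispatched.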