MPR - Media Processor

Media transcoding platform with three execution modes: local (Celery + MinIO), AWS (Step Functions + Lambda + S3), and GCP (Cloud Run Jobs + GCS). Storage is S3-compatible across all environments.

System Overview

Local Architecture (Development)

[Local Architecture diagram]

AWS Architecture (Production)

[AWS Architecture diagram]

GCP Architecture (Production)

[GCP Architecture diagram]

Components

Data Model

Entity Relationships

[Data Model diagram]

Entities

Job Flow

Job Lifecycle

[Job Flow diagram]

Job States

Execution Modes

Media Storage

MPR separates media into input and output paths, each independently configurable. File paths are stored in the database relative to their respective root, which keeps them portable between local development and cloud deployments.

Input / Output Separation

Why Relative Paths?

Local Development

MEDIA_IN=/app/media/in
MEDIA_OUT=/app/media/out

/app/media/
├── in/                    # Source files
│   ├── video1.mp4
│   └── subfolder/video3.mp4
└── out/                   # Transcoded output
    └── video1_h264.mp4

AWS/Cloud Deployment

MEDIA_IN=s3://source-bucket/media/
MEDIA_OUT=s3://output-bucket/transcoded/
MEDIA_BASE_URL=https://source-bucket.s3.amazonaws.com/media/

Database paths remain unchanged because they are already relative; migrating to the cloud only requires uploading the files to S3 and updating the environment variables.
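The root-plus-relative-path scheme can be sketched as a single textual join; `resolve_media_path` below is a hypothetical helper (not the project's actual code) showing why the same database row works against both a local root and an S3 URL.

```python
def resolve_media_path(relative_path: str, root: str) -> str:
    """Join a DB-stored relative path to the configured media root.

    The join is purely textual, so it works the same for local
    roots (/app/media/in) and S3 URLs (s3://source-bucket/media/).
    """
    return root.rstrip("/") + "/" + relative_path.lstrip("/")

# Local development: MEDIA_IN=/app/media/in
assert resolve_media_path("subfolder/video3.mp4", "/app/media/in") \
    == "/app/media/in/subfolder/video3.mp4"

# Cloud deployment: same DB row, different root
assert resolve_media_path("subfolder/video3.mp4", "s3://source-bucket/media/") \
    == "s3://source-bucket/media/subfolder/video3.mp4"
```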

Full Media Storage Documentation →

Chunker Pipeline

The chunker pipeline splits media into time-based segments and streams real-time events from worker threads through Redis and gRPC-Web to the browser UI: seven hops from worker thread to pixel.

Event Path

Worker thread → Pipeline._emit() → event_bridge() → Redis RPUSH
  → [50ms poll] gRPC server LRANGE → yield protobuf
  → HTTP/2 frame → Envoy (grpc-web filter)
  → HTTP/1.1 chunk → nginx (proxy_buffering off)
  → fetch ReadableStream → protobuf-ts decode
  → setEvents([...prev, evt]) → React re-render
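The first half of this path (worker RPUSHes events, gRPC server LRANGEs past a cursor on a 50ms poll) can be sketched as below. The function names, key format, and the `InMemoryRedis` stub are all illustrative assumptions so the sketch runs without a Redis server; a real deployment would use a `redis-py` client with the same `rpush`/`lrange` calls.

```python
import json

class InMemoryRedis:
    """Minimal stand-in for a Redis client (RPUSH/LRANGE only),
    so this sketch runs without a server."""
    def __init__(self):
        self.lists = {}

    def rpush(self, key, value):
        self.lists.setdefault(key, []).append(value)

    def lrange(self, key, start, stop):
        items = self.lists.get(key, [])
        stop = len(items) if stop == -1 else stop + 1
        return items[start:stop]

def event_bridge(client, job_id, event):
    # Worker side: serialize the event and append it to the job's list.
    client.rpush(f"job:{job_id}:events", json.dumps(event))

def poll_events(client, job_id, cursor):
    # gRPC-server side: on each poll, read everything past the
    # last-seen index and advance the cursor.
    raw = client.lrange(f"job:{job_id}:events", cursor, -1)
    return [json.loads(r) for r in raw], cursor + len(raw)

client = InMemoryRedis()
event_bridge(client, "42", {"type": "progress", "pct": 10})
event_bridge(client, "42", {"type": "progress", "pct": 55})
events, cursor = poll_events(client, "42", 0)
assert [e["pct"] for e in events] == [10, 55] and cursor == 2
```

A cursor-based LRANGE keeps the poll loop stateless on the Redis side: the server never pops events, so a reconnecting stream can replay from any offset.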

Thread Model (inside Celery worker)

Celery worker process
  └─ run_job task thread
       └─ Pipeline.run()
            ├─ Producer thread     — enqueues chunks
            ├─ Monitor thread      — emits progress every 500ms
            ├─ Worker thread 0     — pulls from queue, processes
            ├─ Worker thread 1     — pulls from queue, processes
            ├─ Worker thread 2     — pulls from queue, processes
            └─ Worker thread 3     — pulls from queue, processes

Infrastructure

Full Chunker Pipeline Documentation →

API (GraphQL)

All client interactions go through GraphQL at /graphql.

# GraphiQL IDE
http://mpr.local.ar/graphql

# Queries
query { assets(status: "ready") { id filename duration } }
query { jobs(status: "processing") { id status progress } }
query { presets { id name container videoCodec } }
query { systemStatus { status version } }

# Mutations
mutation { scanMediaFolder { found registered skipped } }
mutation { createJob(input: { sourceAssetId: "...", presetId: "..." }) { id status } }
mutation { cancelJob(id: "...") { id status } }
mutation { retryJob(id: "...") { id status } }
mutation { updateAsset(id: "...", input: { comments: "..." }) { id comments } }
mutation { deleteAsset(id: "...") { ok } }

# Lambda callback (REST)
POST /api/jobs/{id}/callback      - Lambda completion webhook
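A client calls these operations by POSTing a JSON body to `/graphql`. The sketch below only builds that body; the `CreateJobInput` type name and the variable shape are assumptions inferred from the mutation listing above, not confirmed schema details.

```python
import json

def graphql_payload(query, variables=None):
    """Build the JSON body a client POSTs to /graphql."""
    return json.dumps({"query": query, "variables": variables or {}}).encode()

# Assumed input type name; check the actual schema in GraphiQL.
CREATE_JOB = """
mutation CreateJob($input: CreateJobInput!) {
  createJob(input: $input) { id status }
}
"""

body = graphql_payload(
    CREATE_JOB,
    {"input": {"sourceAssetId": "a1", "presetId": "p1"}},
)
decoded = json.loads(body)
assert decoded["variables"]["input"]["presetId"] == "p1"
assert "createJob" in decoded["query"]
```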

Supported File Types:

Video: mp4, mkv, avi, mov, webm, flv, wmv, m4v
Audio: mp3, wav, flac, aac, ogg, m4a

Access Points

# Add to /etc/hosts
127.0.0.1 mpr.local.ar

# URLs
http://mpr.local.ar/admin         - Django Admin
http://mpr.local.ar/graphql       - GraphiQL IDE
http://mpr.local.ar/              - Timeline UI
http://mpr.local.ar/chunker/      - Chunker UI
http://localhost:9001              - MinIO Console

# AWS deployment
https://mpr.mcrn.ar/              - Production

Quick Reference

# Render SVGs from DOT files
for f in docs/architecture/*.dot; do dot -Tsvg "$f" -o "${f%.dot}.svg"; done

# Switch executor mode
MPR_EXECUTOR=local    # Celery + MinIO
MPR_EXECUTOR=lambda   # Step Functions + Lambda + S3
MPR_EXECUTOR=gcp      # Cloud Run Jobs + GCS
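Internally, switching modes amounts to dispatching on `MPR_EXECUTOR`; the sketch below illustrates that pattern with made-up backend names (the project's actual executor classes are not shown here).

```python
import os

# Illustrative backend names only; not the project's real classes.
BACKENDS = {
    "local": "CeleryExecutor",         # Celery + MinIO
    "lambda": "StepFunctionsExecutor", # Step Functions + Lambda + S3
    "gcp": "CloudRunExecutor",         # Cloud Run Jobs + GCS
}

def select_executor(env=None):
    """Pick an executor backend from MPR_EXECUTOR, defaulting to local."""
    env = os.environ if env is None else env
    mode = env.get("MPR_EXECUTOR", "local")
    try:
        return BACKENDS[mode]
    except KeyError:
        raise ValueError(f"unknown MPR_EXECUTOR: {mode!r}")

assert select_executor({"MPR_EXECUTOR": "gcp"}) == "CloudRunExecutor"
assert select_executor({}) == "CeleryExecutor"  # default mode
```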