media-analyzer/backend/media_analyzer/urls.py

why is MEDIA_ROOT only served in DEBUG?

this is used to store media files on disk, which are obviously not ideal to keep in the DB (at least on Postgres)
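
The usual Django pattern behind this, as a minimal sketch (this is the documented django.conf.urls.static idiom, not necessarily the project's exact urls.py; route contents elided):

# media_analyzer/urls.py -- sketch of the standard DEBUG-only media serving
from django.conf import settings
from django.conf.urls.static import static

urlpatterns = [
    # ... API routes ...
]

if settings.DEBUG:
    # Django's static file view is slow and unhardened, so Django only
    # serves MEDIA_ROOT during local development; in production a real
    # web server (here, NGINX) serves the media files instead
    urlpatterns += static(settings.MEDIA_URL, document_root=settings.MEDIA_ROOT)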
nginx

why add CORS straight in the nginx conf?

> is it correct to have cors configuration in nginx?

● Yes and no - it depends on the service's responsibility:

NGINX CORS is correct when:
- NGINX serves static media files (HLS videos)
- Browser needs to access video files cross-origin
- NGINX is the media server

Django CORS is correct when:
- Django serves API endpoints
- Angular makes API calls cross-origin
- Django is the application server

Your current setup needs BOTH:
- Django CORS: for Angular → Django API calls (/api/streams/)
- NGINX CORS: for Angular → NGINX media requests (/playlist.m3u8, /segment.ts)

Separation of concerns:
Angular (4200) → Django (8000)  # API calls   - Django CORS
Angular (4200) → NGINX (8081)   # Video files - NGINX CORS

In production/K8s, you'd typically:
- Have one ingress controller handle all CORS
- Or use a service mesh (Istio) for cross-origin policies

But for development, having CORS in both services is correct because they serve different content types to the browser.
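
For the Django side, a minimal sketch using the django-cors-headers package (the origin list is illustrative for the dev setup above, not necessarily the project's settings):

# settings.py -- sketch, assuming django-cors-headers is installed
INSTALLED_APPS = [
    # ...
    "corsheaders",
]

MIDDLEWARE = [
    "corsheaders.middleware.CorsMiddleware",  # must sit above CommonMiddleware
    "django.middleware.common.CommonMiddleware",
    # ...
]

# only the Angular dev server may call the API cross-origin
CORS_ALLOWED_ORIGINS = ["http://localhost:4200"]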

on the AI feature it decided on a model design, and I'm not sure what actually matters here

> can you explain why this model design?

● Looking at this model design, here's the reasoning:

VideoAnalysis - Central hub that tracks each analysis run with metadata like processing time, confidence thresholds, and frame timestamps. This allows performance monitoring and result correlation.

ObjectDetection - YOLO results with bounding boxes. Multiple detections can belong to one analysis (one frame can have many objects).

SceneAnalysis - CLIP results for scene understanding, text detection, and embeddings. One-to-one with VideoAnalysis, since CLIP analyzes the whole frame context.

ProcessingQueue - Decouples video processing from real-time streaming. Allows configurable analysis types per segment (you might want only YOLO for some streams, CLIP+YOLO for others).

Key design decisions:
- Separation of concerns: YOLO (objects) vs CLIP (scenes/text) have different data structures
- Flexible processing: queue supports different AI model combinations per segment
- Performance tracking: processing times and confidence thresholds for optimization
- Scalability: UUIDs and proper indexing for distributed processing
- Real-time ready: frame timestamps enable overlay synchronization

The analysis_types JSON field in ProcessingQueue is particularly important - it lets you configure which AI models to run per stream (e.g., ['yolo', 'clip_scene'] for object detection + scene analysis).
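
A minimal sketch of how these models might be laid out (field names and types are assumptions for illustration, not the project's actual schema):

# models.py -- illustrative sketch; the real fields may differ
import uuid
from django.db import models


class VideoAnalysis(models.Model):
    id = models.UUIDField(primary_key=True, default=uuid.uuid4, editable=False)
    frame_timestamp = models.FloatField(db_index=True)   # position in the stream
    processing_time_ms = models.IntegerField()           # for performance tracking
    confidence_threshold = models.FloatField(default=0.5)


class ObjectDetection(models.Model):
    # many detections per analysis: one frame can contain many objects
    analysis = models.ForeignKey(VideoAnalysis, related_name="detections",
                                 on_delete=models.CASCADE)
    label = models.CharField(max_length=100)
    confidence = models.FloatField()
    bbox = models.JSONField()  # e.g. {"x": .., "y": .., "w": .., "h": ..}


class SceneAnalysis(models.Model):
    # one-to-one: CLIP describes the whole frame context
    analysis = models.OneToOneField(VideoAnalysis, on_delete=models.CASCADE)
    scene_labels = models.JSONField()
    embedding = models.JSONField()  # CLIP embedding vector


class ProcessingQueue(models.Model):
    segment_path = models.CharField(max_length=255)
    analysis_types = models.JSONField(default=list)  # e.g. ["yolo", "clip_scene"]
    status = models.CharField(max_length=20, default="pending", db_index=True)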

still hardcoding stuff???

class TextDetectionAdapterFactory(AdapterFactory):
    """Factory for text detection adapters"""

    @staticmethod
    def create(provider_config):
        provider_type = provider_config.get('provider_type')

        if provider_type == 'gcp_vision':
            return GCPTextDetectionAdapter()
        elif provider_type == 'local_tesseract':
            return TesseractTextDetectionAdapter()
        else:
            raise ValueError(f"Unknown text detection provider: {provider_type}")
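
One way to drop the hardcoded branching would be a registry keyed by provider type - a sketch, assuming the AdapterFactory base and adapter classes from the snippet above:

# Sketch: replace the if/elif chain with a pluggable registry
class TextDetectionAdapterFactory(AdapterFactory):
    """Factory for text detection adapters"""

    _registry = {
        'gcp_vision': GCPTextDetectionAdapter,
        'local_tesseract': TesseractTextDetectionAdapter,
    }

    @classmethod
    def register(cls, provider_type, adapter_cls):
        # new providers plug in without touching the factory itself
        cls._registry[provider_type] = adapter_cls

    @classmethod
    def create(cls, provider_config):
        provider_type = provider_config.get('provider_type')
        try:
            return cls._registry[provider_type]()
        except KeyError:
            raise ValueError(f"Unknown text detection provider: {provider_type}")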

add aliases to avoid relative imports

../frontend/src/app/components/stream-viewer/stream-viewer.component.ts

import { Component, ElementRef, Input, ViewChild, AfterViewInit, OnDestroy, OnChanges } from '@angular/core';
import Hls from 'hls.js';
import { Analysis, DetectionResult } from '../../models/analysis';  // <- added import

@Component({
  selector: 'app-stream-viewer',

I need more details on why Celery for the AI processing task. I reckon some queue is needed for compute- and/or IO-heavy interactions like AI processing, which isn't just passing through what comes from the stream - but this needs some depth.

Your system should now show live logo detection when you hold up branded items! 🏷️

> cool, just a question dont do anything but I need more details on why celery for the AI processing task, I reckon that some queue is needed for compute and/or IO heavy interactions like AI processing which not just passing through what comes from the stream but yeah, need some depth

● You're absolutely right to question this! Let me explain the specific reasons for Celery in AI video processing:

Why Celery for AI Processing?

1. Processing Time Reality

- CLIP model inference: 200-500ms per frame on CPU, 50-100ms on GPU
- Frame extraction: 50-200ms per video segment
- Total per segment: 250-700ms minimum
- Problem: the Django request-response cycle would time out, blocking the web server

2. HLS Streaming Constraints

- HLS segments: created every 2-6 seconds by FFmpeg
- Analysis lag: if processing takes 500ms, you're always behind
- User experience: the video player can't wait for AI analysis to complete
- Solution: async processing lets video serve immediately, AI catches up

3. Resource Management

- Memory intensive: each CLIP model uses 400MB-1GB RAM
- CPU bound: AI inference blocks the entire process
- Multiple streams: 5 concurrent streams = 5x resource usage
- Queue benefits: limit concurrent AI tasks, prevent memory exhaustion

4. Scaling Considerations

- Single server: process 2-3 streams simultaneously
- Multiple workers: Celery scales across machines/GPUs
- Priority queues: real-time streams get priority over batch analysis
- Failure handling: retry failed analyses, dead-letter queues
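
To make the resource and queue points concrete, a minimal Celery configuration sketch (broker URL, numbers, and task paths are illustrative assumptions, not the project's actual settings):

# celery_app.py -- illustrative sketch, not the project's real config
from celery import Celery

app = Celery("media_analyzer", broker="redis://localhost:6379/0")

app.conf.update(
    worker_concurrency=2,          # cap concurrent AI tasks to bound RAM/CPU use
    worker_prefetch_multiplier=1,  # long tasks: fetch one at a time
    task_acks_late=True,           # re-deliver if a worker dies mid-inference
    task_routes={
        # real-time segment analysis and batch jobs go to separate queues,
        # so dedicated workers can prioritize the live streams
        "analysis.tasks.analyze_segment": {"queue": "realtime"},
        "analysis.tasks.batch_reanalyze": {"queue": "batch"},
    },
)

Workers would then be started per queue, e.g. celery -A celery_app worker -Q realtime, so a slow batch job can never starve the live streams.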

Alternative Approaches

Without Celery (synchronous):

# This blocks the Django thread for 500ms+
def serve_hls_file(request, filename):
    # ... serve file ...
    analysis = analyze_frame_now(segment)  # BLOCKS HERE
    return response  # Takes 500ms+ total

With threading (risky):

# Memory leaks, no queue management, hard to monitor
import threading
threading.Thread(target=analyze_frame, args=(segment,)).start()

With asyncio (complex):

# Requires the entire Django app to be async, complex error handling
async def analyze_frame_async():
    # Works but harder to manage resources
    ...

Your Architecture Benefits

Current flow:
1. HLS segment created → served immediately (fast)
2. Analysis queued → a Celery worker picks it up (sketched below)
3. Results ready → WebSocket pushes to frontend
4. User sees: video plays smoothly + analysis appears seconds later

Without queue:
1. HLS segment created → wait for AI analysis (slow)
2. User experience: choppy video, timeouts
3. Server: one slow AI task blocks everything
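
The queued path from the "Current flow" above, as a minimal sketch (analyze_frame_now and push_results_to_websocket are hypothetical helpers standing in for the project's real inference and WebSocket code):

# analysis/tasks.py -- sketch of the queued path
from celery import shared_task


@shared_task(bind=True, max_retries=3, default_retry_delay=5)
def analyze_segment(self, segment_path, analysis_types):
    try:
        # heavy part: frame extraction + YOLO/CLIP inference (250-700ms+)
        results = analyze_frame_now(segment_path, analysis_types)
    except Exception as exc:
        # transient failures (OOM, model load) retry with a short delay
        raise self.retry(exc=exc)
    # the video segment was already served; results arrive seconds later
    push_results_to_websocket(segment_path, results)

The HLS pipeline then only enqueues work - analyze_segment.delay(segment_path, ['yolo', 'clip_scene']) returns immediately - so serving the segment never waits on inference.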

Business Value

- Real-time streaming: video never stutters
- Scalable: add more Celery workers as needed
- Reliable: failed analyses retry automatically
- Monitorable: queue depth shows system load
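
A quick way to check that queue depth with a Redis broker, for instance ("celery" is the default queue name; "realtime" and "batch" are the assumed names from the routing sketch above):

# queue_depth.py -- sketch: with a Redis broker, each Celery queue is a list
import redis

r = redis.Redis(host="localhost", port=6379, db=0)
for queue in ("celery", "realtime", "batch"):
    print(queue, r.llen(queue))  # pending (unclaimed) tasks per queue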

For media monitoring at scale (like Seenka's use case), you need this separation because analyzing "millions of broadcasts" means handling thousands of concurrent streams - impossible without proper async processing.

The queue is essential for production-grade video AI systems!