# 01 - Scene Detection Sensitivity, Image Quality, and Granular Caching

## Date

2025-10-28

## Context

The last run on the zaca-run-scrapers sample (a Zed editor walkthrough) detected only 19 frames, with gaps of 7+ minutes between them. Whisper did not run because the flag was not passed. JPEG compression quality was too poor for code/text readability.

## Problems Identified

1. **Scene detection too conservative** - Default threshold of 30.0 missed file switches and scrolling in a clean UI (Zed vs VS Code)
2. **No whisper transcription** - User expected it to run, but `--run-whisper` is opt-in
3. **Poor JPEG quality** - Default compression made code/text hard to read for OCR/vision
4. **Subprocess-based FFmpeg** - Using shell commands instead of a Python library
5. **All-or-nothing caching** - `--no-cache` regenerates everything, including the slow whisper transcription

## Changes Made

### 1. Scene Detection Sensitivity

**Files:** `meetus/frame_extractor.py`, `process_meeting.py`, `meetus/workflow.py`

- Lowered default threshold: `30.0` → `15.0` (more sensitive for clean UIs)
- Added `--scene-threshold` CLI argument (0-100, lower = more sensitive)
- Added threshold to manifest for tracking
- Updated docstring with usage guidelines:
  - 15.0: Good for clean UIs like Zed
  - 20-30: Busy UIs like VS Code
  - 5-10: Very subtle changes

### 2. JPEG Quality Improvements

**Files:** `meetus/frame_extractor.py`

- **Interval extraction**: Added `cv2.IMWRITE_JPEG_QUALITY, 95` (line 60)
- **Scene detection**: Added `-q:v 2` to FFmpeg (best quality, line 94)

### 3. Migration to ffmpeg-python

**Files:** `meetus/frame_extractor.py`, `requirements.txt`

- Replaced `subprocess.run()` with the `ffmpeg-python` library
- Cleaner, more Pythonic API
- Better error handling with `ffmpeg.Error`
- Added to requirements.txt

### 4. Granular Cache Control

**Files:** `process_meeting.py`, `meetus/workflow.py`, `meetus/cache_manager.py`

Added three new flags for selective cache invalidation:

- `--skip-cache-frames`: Regenerate frames (useful when tuning scene threshold)
- `--skip-cache-whisper`: Rerun whisper transcription
- `--skip-cache-analysis`: Rerun OCR/vision analysis

**Key design:**

- `--no-cache`: Still works as before (new directory + regenerate everything)
- New flags: Reuse the existing output directory but selectively invalidate caches
- Frames are cleaned up when regenerating to avoid stale data

## Typical Workflow

```bash
# First run - generate everything including whisper (expensive, once)
python process_meeting.py samples/video.mkv --run-whisper --scene-detection --use-vision

# Iterate on scene threshold without re-running whisper
python process_meeting.py samples/video.mkv --scene-detection --scene-threshold 10 --use-vision --skip-cache-frames --skip-cache-analysis

# Try even more sensitive
python process_meeting.py samples/video.mkv --scene-detection --scene-threshold 5 --use-vision --skip-cache-frames --skip-cache-analysis
```

## Notes

- Whisper is the most expensive step and also the most reliable → always keep its cache during iteration
- Scene detection needs tuning per UI style (Zed vs VS Code)
- Vision analysis should regenerate when frames change
- Walking through code (file switches, scrolling) should trigger scene changes

## Files Modified

- `meetus/frame_extractor.py` - Scene threshold, quality, ffmpeg-python
- `meetus/workflow.py` - Cache flags, frame cleanup
- `meetus/cache_manager.py` - Granular cache checks
- `process_meeting.py` - CLI arguments
- `requirements.txt` - Added ffmpeg-python
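
## Sketch: Cache Flag Resolution

For reference, here is a minimal sketch of how the granular cache flags described above might resolve into per-stage decisions. The flag names come from this note; `CachePlan`, `build_parser`, and `plan_cache` are hypothetical names used for illustration only, and the real logic in `meetus/cache_manager.py` and `meetus/workflow.py` may be structured differently.

```python
import argparse
from dataclasses import dataclass


@dataclass
class CachePlan:
    """Hypothetical container: which stages may reuse cached output."""
    use_frames: bool
    use_whisper: bool
    use_analysis: bool


def build_parser() -> argparse.ArgumentParser:
    # Only the cache-related flags from this note are modeled here.
    parser = argparse.ArgumentParser(description="Cache flag sketch")
    parser.add_argument("--no-cache", action="store_true",
                        help="Regenerate everything")
    parser.add_argument("--skip-cache-frames", action="store_true",
                        help="Regenerate frames only")
    parser.add_argument("--skip-cache-whisper", action="store_true",
                        help="Rerun whisper transcription only")
    parser.add_argument("--skip-cache-analysis", action="store_true",
                        help="Rerun OCR/vision analysis only")
    return parser


def plan_cache(args: argparse.Namespace) -> CachePlan:
    # --no-cache overrides the granular flags: nothing is reused.
    if args.no_cache:
        return CachePlan(False, False, False)
    # Otherwise each stage reuses its cache unless its flag was passed.
    return CachePlan(
        use_frames=not args.skip_cache_frames,
        use_whisper=not args.skip_cache_whisper,
        use_analysis=not args.skip_cache_analysis,
    )


if __name__ == "__main__":
    plan = plan_cache(build_parser().parse_args())
    print(plan)
```

In this sketch the stages are independent, which is why the workflow commands pass `--skip-cache-frames` and `--skip-cache-analysis` together: new frames do not automatically invalidate the analysis cache here, while the whisper cache is left untouched.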