01 - Scene Detection Sensitivity, Image Quality, and Granular Caching
Date
2025-10-28
Context
The last run on the zaca-run-scrapers sample (a Zed editor walkthrough) detected only 19 frames, with gaps of 7+ minutes between them. Whisper did not run because the flag was not passed. JPEG compression quality was too poor for code/text readability.
Problems Identified
- Scene detection too conservative: the default threshold of 30.0 missed file switches and scrolling in a clean UI (Zed vs VS Code)
- No whisper transcription: the user expected it to run, but --run-whisper is opt-in
- Poor JPEG quality: default compression made code/text hard to read for OCR/vision
- Subprocess-based FFmpeg: shell commands were used instead of a Python library
- All-or-nothing caching: --no-cache regenerates everything, including the slow whisper transcription
Changes Made
1. Scene Detection Sensitivity
Files: meetus/frame_extractor.py, process_meeting.py, meetus/workflow.py
- Lowered default threshold: 30.0 → 15.0 (more sensitive for clean UIs)
- Added --scene-threshold CLI argument (0-100, lower = more sensitive)
- Added threshold to manifest for tracking
- Updated docstring with usage guidelines:
  - 15.0: Good for clean UIs like Zed
  - 20-30: Busy UIs like VS Code
  - 5-10: Very subtle changes
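A minimal sketch of how a range-checked --scene-threshold argument could be declared with argparse. The helper names (scene_threshold, build_parser, scene_select_expr) are hypothetical, and the linear mapping from the 0-100 threshold onto FFmpeg's 0-1 scene score is an assumption, not something this log confirms about frame_extractor.py.

```python
import argparse


def scene_threshold(value):
    """argparse type: parse a float and require 0 <= value <= 100."""
    try:
        threshold = float(value)
    except ValueError:
        raise argparse.ArgumentTypeError(f"not a number: {value!r}")
    if not 0.0 <= threshold <= 100.0:
        raise argparse.ArgumentTypeError("scene threshold must be between 0 and 100")
    return threshold


def build_parser():
    """Hypothetical subset of the process_meeting.py CLI."""
    parser = argparse.ArgumentParser(prog="process_meeting.py")
    parser.add_argument("video")
    parser.add_argument("--scene-detection", action="store_true")
    parser.add_argument(
        "--scene-threshold",
        type=scene_threshold,
        default=15.0,
        help="0-100, lower = more sensitive (default: 15.0)",
    )
    return parser


def scene_select_expr(threshold):
    # Assumption: the 0-100 threshold maps linearly onto FFmpeg's 0-1 scene score.
    return f"gt(scene,{threshold / 100:.3f})"
```

Validating the range in the argparse type keeps out-of-range values from ever reaching frame extraction, so a typo such as --scene-threshold 150 fails at parse time.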
2. JPEG Quality Improvements
Files: meetus/frame_extractor.py
- Interval extraction: Added cv2.IMWRITE_JPEG_QUALITY, 95 (line 60)
- Scene detection: Added -q:v 2 to FFmpeg (best quality, line 94)
3. Migration to ffmpeg-python
Files: meetus/frame_extractor.py, requirements.txt
- Replaced subprocess.run() with ffmpeg-python library
- Cleaner, more Pythonic API
- Better error handling with ffmpeg.Error
- Added to requirements.txt
4. Granular Cache Control
Files: process_meeting.py, meetus/workflow.py, meetus/cache_manager.py
Added three new flags for selective cache invalidation:
- --skip-cache-frames: Regenerate frames (useful when tuning scene threshold)
- --skip-cache-whisper: Rerun whisper transcription
- --skip-cache-analysis: Rerun OCR/vision analysis
Key design:
- --no-cache: Still works as before (new directory + regenerate everything)
- New flags: Reuse existing output directory but selectively invalidate caches
- Frames are cleaned up when regenerating to avoid stale data
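The selective invalidation described above can be sketched as a small policy object. CachePolicy, resolve_cache_policy, and clear_stale_frames are hypothetical names, not the actual API of meetus/cache_manager.py, and the sketch models only which stages are regenerated, not the new output directory that --no-cache creates.

```python
from dataclasses import dataclass
from pathlib import Path


@dataclass(frozen=True)
class CachePolicy:
    """Which pipeline stages may reuse cached output (True = reuse)."""
    frames: bool = True
    whisper: bool = True
    analysis: bool = True


def resolve_cache_policy(no_cache=False, skip_cache_frames=False,
                         skip_cache_whisper=False, skip_cache_analysis=False):
    """Translate CLI flags into a per-stage reuse decision."""
    # --no-cache invalidates every stage; otherwise each flag invalidates one stage.
    if no_cache:
        return CachePolicy(frames=False, whisper=False, analysis=False)
    return CachePolicy(
        frames=not skip_cache_frames,
        whisper=not skip_cache_whisper,
        analysis=not skip_cache_analysis,
    )


def clear_stale_frames(frames_dir):
    """Delete previously extracted JPEGs so old frames cannot mix with new ones."""
    removed = 0
    for jpg in Path(frames_dir).glob("*.jpg"):
        jpg.unlink()
        removed += 1
    return removed
```

With --skip-cache-frames and --skip-cache-analysis set, the resulting policy regenerates frames and analysis while still reusing the whisper transcript, which is the combination used when iterating on the scene threshold.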
Typical Workflow
# First run - generate everything including whisper (expensive, once)
python process_meeting.py samples/video.mkv --run-whisper --scene-detection --use-vision
# Iterate on scene threshold without re-running whisper
python process_meeting.py samples/video.mkv --scene-detection --scene-threshold 10 --use-vision --skip-cache-frames --skip-cache-analysis
# Try even more sensitive
python process_meeting.py samples/video.mkv --scene-detection --scene-threshold 5 --use-vision --skip-cache-frames --skip-cache-analysis
Notes
- Whisper is the most expensive step and its output is reliable → always keep it cached during iteration
- Scene detection needs tuning per UI style (Zed vs VS Code)
- Vision analysis should regenerate when frames change
- Walking through code (file switches, scrolling) should trigger scene changes
Files Modified
- meetus/frame_extractor.py: Scene threshold, quality, ffmpeg-python
- meetus/workflow.py: Cache flags, frame cleanup
- meetus/cache_manager.py: Granular cache checks
- process_meeting.py: CLI arguments
- requirements.txt: Added ffmpeg-python