# Meeting Processor

Extract screen content from meeting recordings and merge with Whisper transcripts for better Claude summarization.

## Overview

This tool enhances meeting transcripts by combining:

- **Audio transcription** (from Whisper)
- **Screen content** (OCR from screen shares)

The result is a rich, timestamped transcript that provides full context for AI summarization.

## Installation

### 1. System Dependencies

**Tesseract OCR** (recommended):

```bash
# Ubuntu/Debian
sudo apt-get install tesseract-ocr

# macOS
brew install tesseract

# Arch Linux
sudo pacman -S tesseract
```

**FFmpeg** (for scene detection):

```bash
# Ubuntu/Debian
sudo apt-get install ffmpeg

# macOS
brew install ffmpeg
```

### 2. Python Dependencies

```bash
pip install -r requirements.txt
```

### 3. Optional: Install Alternative OCR Engines

```bash
# EasyOCR (better for rotated/handwritten text)
pip install easyocr

# PaddleOCR (better for code/terminal screens)
pip install paddleocr
```

## Quick Start

### Basic Usage (Screen Content Only)

```bash
python process_meeting.py samples/meeting.mkv
```

This will:

1. Extract frames every 5 seconds
2. Run OCR to extract screen text
3. Save enhanced transcript to `meeting_enhanced.txt`

### With Whisper Transcript

First, generate a Whisper transcript:

```bash
whisper samples/meeting.mkv --model base --output_format json
```

Then process with screen content:

```bash
python process_meeting.py samples/meeting.mkv --transcript samples/meeting.json
```

## Usage Examples

### Extract frames at different intervals

```bash
# Every 10 seconds
python process_meeting.py samples/meeting.mkv --interval 10

# Every 3 seconds (more detailed)
python process_meeting.py samples/meeting.mkv --interval 3
```

### Use scene detection (smarter, fewer frames)

```bash
python process_meeting.py samples/meeting.mkv --scene-detection
```

### Use different OCR engines

```bash
# EasyOCR (good for varied layouts)
python process_meeting.py samples/meeting.mkv --ocr-engine easyocr

# PaddleOCR (good for code/terminal)
python process_meeting.py samples/meeting.mkv --ocr-engine paddleocr
```

### Extract frames only (no merging)

```bash
python process_meeting.py samples/meeting.mkv --extract-only
```

### Custom output location

```bash
python process_meeting.py samples/meeting.mkv --output my_meeting.txt --frames-dir my_frames/
```

### Enable verbose logging

```bash
# Show detailed debug information
python process_meeting.py samples/meeting.mkv --verbose

# Short form
python process_meeting.py samples/meeting.mkv -v
```

## Output Files

After processing, you'll get:

- **`