init commit

2025-10-19 22:17:38 -03:00
commit 93e0c06d38
10 changed files with 969 additions and 0 deletions
--- a/README.md
+++ b/README.md
@@ -0,0 +1,239 @@
+# Meeting Processor
+
+Extract screen content from meeting recordings and merge it with Whisper transcripts for better Claude summarization.
+
+## Overview
+
+This tool enhances meeting transcripts by combining:
+- **Audio transcription** (from Whisper)
+- **Screen content** (OCR from screen shares)
+
+The result is a rich, timestamped transcript that provides full context for AI summarization.
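
One way to picture the combination described above is an interleave of two timestamped streams. The following is a minimal sketch, not the tool's actual code: it assumes each entry is a `(seconds, text)` pair, and the function name `merge_by_time` and the `[AUDIO]`/`[SCREEN]` labels are hypothetical.

```python
from typing import List, Tuple

Entry = Tuple[float, str]  # (timestamp in seconds, text)


def merge_by_time(audio: List[Entry], screen: List[Entry]) -> List[str]:
    """Interleave audio and screen entries in timestamp order."""
    tagged = [(t, "AUDIO", text) for t, text in audio]
    tagged += [(t, "SCREEN", text) for t, text in screen]
    # Sorting on the timestamp only is stable, so audio stays ahead of
    # screen content when both share the same timestamp.
    tagged.sort(key=lambda item: item[0])

    lines = []
    for t, source, text in tagged:
        minutes, seconds = divmod(int(t), 60)
        lines.append(f"[{minutes:02d}:{seconds:02d}] [{source}] {text}")
    return lines
```

Calling `merge_by_time` with one list per source yields a single chronological list of labelled lines that can be joined with newlines and written to a text file.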
+
+## Installation
+
+### 1. System Dependencies
+
+**Tesseract OCR** (recommended):
+```bash
+# Ubuntu/Debian
+sudo apt-get install tesseract-ocr
+
+# macOS
+brew install tesseract
+
+# Arch Linux
+sudo pacman -S tesseract
+```
+
+**FFmpeg** (for scene detection):
+```bash
+# Ubuntu/Debian
+sudo apt-get install ffmpeg
+
+# macOS
+brew install ffmpeg
+
+# Arch Linux
+sudo pacman -S ffmpeg
+```
+
+### 2. Python Dependencies
+
+```bash
+pip install -r requirements.txt
+```
+
+### 3. Optional: Install Alternative OCR Engines
+
+```bash
+# EasyOCR (better for rotated/handwritten text)
+pip install easyocr
+
+# PaddleOCR (better for code/terminal screens)
+pip install paddleocr
+```
+
+## Quick Start
+
+### Basic Usage (Screen Content Only)
+
+```bash
+python process_meeting.py samples/meeting.mkv
+```
+
+This will:
+1. Extract frames every 5 seconds
+2. Run OCR to extract screen text
+3. Save enhanced transcript to `meeting_enhanced.txt`
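
Step 1 above can be approximated with FFmpeg's `fps` filter. This sketch only builds the command line; it is an assumption about how interval extraction could be done, not a description of what `process_meeting.py` runs internally. The helper name `interval_extract_cmd` and the `frame_%05d.jpg` naming scheme are hypothetical.

```python
from pathlib import Path
from typing import List


def interval_extract_cmd(video: str, frames_dir: str = "frames", interval: int = 5) -> List[str]:
    """Build an ffmpeg command that saves one JPG every `interval` seconds."""
    if interval <= 0:
        raise ValueError("interval must be a positive number of seconds")
    pattern = str(Path(frames_dir) / "frame_%05d.jpg")  # hypothetical file naming
    return [
        "ffmpeg",
        "-i", video,
        "-vf", f"fps=1/{interval}",  # one output frame per `interval` seconds
        "-q:v", "2",                 # high JPEG quality
        pattern,
    ]
```

The returned list can be passed to `subprocess.run(cmd, check=True)` once the frames directory exists.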
+
+### With Whisper Transcript
+
+First, generate a Whisper transcript:
+```bash
+whisper samples/meeting.mkv --model base --output_format json
+```
+
+Then process with screen content:
+```bash
+python process_meeting.py samples/meeting.mkv --transcript samples/meeting.json
+```
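
For orientation, the JSON file written by the `whisper` CLI contains a top-level `segments` list whose items carry `start`, `end`, and `text` fields. A minimal sketch of reading it follows; the function name `load_whisper_segments` is hypothetical and the tool's own loader may differ.

```python
import json
from typing import Dict, List


def load_whisper_segments(path: str) -> List[Dict]:
    """Return the timed segments from a Whisper JSON transcript."""
    with open(path, encoding="utf-8") as fh:
        data = json.load(fh)
    return [
        {
            "start": float(seg["start"]),  # segment start, in seconds
            "end": float(seg["end"]),      # segment end, in seconds
            "text": seg["text"].strip(),
        }
        for seg in data.get("segments", [])
    ]
```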
+
+## Usage Examples
+
+### Extract frames at different intervals
+```bash
+# Every 10 seconds
+python process_meeting.py samples/meeting.mkv --interval 10
+
+# Every 3 seconds (more detailed)
+python process_meeting.py samples/meeting.mkv --interval 3
+```
+
+### Use scene detection (smarter, fewer frames)
+```bash
+python process_meeting.py samples/meeting.mkv --scene-detection
+```
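
Scene detection with FFmpeg is commonly done through the `select` filter and its `scene` score. The sketch below builds such a command under stated assumptions: the `0.3` threshold, the helper name `scene_extract_cmd`, and the `scene_%05d.jpg` naming are illustrative and not taken from the tool.

```python
from pathlib import Path
from typing import List


def scene_extract_cmd(video: str, frames_dir: str = "frames", threshold: float = 0.3) -> List[str]:
    """Build an ffmpeg command that keeps only frames where the scene changes."""
    if not 0.0 < threshold < 1.0:
        raise ValueError("threshold must be between 0 and 1")
    pattern = str(Path(frames_dir) / "scene_%05d.jpg")  # hypothetical file naming
    return [
        "ffmpeg",
        "-i", video,
        # Keep frames whose scene-change score exceeds the threshold;
        # showinfo logs each kept frame's pts_time to stderr.
        "-vf", f"select='gt(scene,{threshold})',showinfo",
        "-vsync", "vfr",  # newer FFmpeg builds prefer "-fps_mode vfr"
        pattern,
    ]
```

A lower threshold keeps more frames, which is the trade-off to tune when slides change only slightly.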
+
+### Use different OCR engines
+```bash
+# EasyOCR (good for varied layouts)
+python process_meeting.py samples/meeting.mkv --ocr-engine easyocr
+
+# PaddleOCR (good for code/terminal)
+python process_meeting.py samples/meeting.mkv --ocr-engine paddleocr
+```
+
+### Extract frames only (no merging)
+```bash
+python process_meeting.py samples/meeting.mkv --extract-only
+```
+
+### Custom output location
+```bash
+python process_meeting.py samples/meeting.mkv --output my_meeting.txt --frames-dir my_frames/
+```
+
+### Enable verbose logging
+```bash
+# Show detailed debug information
+python process_meeting.py samples/meeting.mkv --verbose
+
+# Short form
+python process_meeting.py samples/meeting.mkv -v
+```
+
+## Output Files
+
+After processing, you'll get:
+
+- **`<video>_enhanced.txt`** - Enhanced transcript ready for Claude
+- **`<video>_ocr.json`** - Raw OCR data with timestamps
+- **`frames/`** - Extracted video frames (JPG files)
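
The naming pattern above can be expressed as a small helper. This is a sketch that assumes the output files are named after the video's stem and written relative to the current directory; the function name `default_outputs` is hypothetical.

```python
from pathlib import Path
from typing import Dict


def default_outputs(video: str, frames_dir: str = "frames") -> Dict[str, Path]:
    """Derive the default output paths from the video file name."""
    stem = Path(video).stem  # "samples/meeting.mkv" -> "meeting"
    return {
        "enhanced": Path(f"{stem}_enhanced.txt"),
        "ocr": Path(f"{stem}_ocr.json"),
        "frames": Path(frames_dir),
    }
```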
+
+## Workflow for Meeting Analysis
+
+### Complete Workflow
+
+```bash
+# 1. Extract audio and transcribe with Whisper
+whisper samples/alo-intro1.mkv --model base --output_format json
+
+# 2. Process video to extract screen content
+python process_meeting.py samples/alo-intro1.mkv \
+    --transcript samples/alo-intro1.json \
+    --scene-detection
+
+# 3. Use the enhanced transcript with Claude
+# Copy the content from alo-intro1_enhanced.txt and paste into Claude
+```
+
+### Example Prompt for Claude
+
+```
+Please summarize this meeting transcript. Pay special attention to:
+1. Key decisions made
+2. Action items
+3. Technical details shown on screen
+4. Any metrics or data presented
+
+[Paste enhanced transcript here]
+```
+
+## Command Reference
+
+```
+usage: process_meeting.py [-h] [--transcript TRANSCRIPT] [--output OUTPUT]
+                          [--frames-dir FRAMES_DIR] [--interval INTERVAL]
+                          [--scene-detection]
+                          [--ocr-engine {tesseract,easyocr,paddleocr}]
+                          [--no-deduplicate] [--extract-only]
+                          [--format {detailed,compact}] [--verbose]
+                          video
+
+Options:
+  video                 Path to video file
+  --transcript, -t      Path to Whisper transcript (JSON or TXT)
+  --output, -o          Output file for enhanced transcript
+  --frames-dir          Directory to save extracted frames (default: frames/)
+  --interval            Extract frame every N seconds (default: 5)
+  --scene-detection     Use scene detection instead of interval extraction
+  --ocr-engine          OCR engine: tesseract, easyocr, paddleocr (default: tesseract)
+  --no-deduplicate      Disable text deduplication
+  --extract-only        Only extract frames and OCR, skip transcript merging
+  --format              Output format: detailed or compact (default: detailed)
+  --verbose, -v         Enable verbose logging (DEBUG level)
+```
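
The options above map naturally onto `argparse`. The following sketch reconstructs a parser from the usage text only; details such as `type=int` for `--interval` are assumptions, and the real script may define its arguments differently.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Reconstruct the documented CLI from the usage text."""
    parser = argparse.ArgumentParser(prog="process_meeting.py")
    parser.add_argument("video", help="Path to video file")
    parser.add_argument("--transcript", "-t", help="Path to Whisper transcript (JSON or TXT)")
    parser.add_argument("--output", "-o", help="Output file for enhanced transcript")
    parser.add_argument("--frames-dir", default="frames/", help="Directory to save extracted frames")
    # Assumption: the interval is parsed as an integer number of seconds.
    parser.add_argument("--interval", type=int, default=5, help="Extract frame every N seconds")
    parser.add_argument("--scene-detection", action="store_true")
    parser.add_argument("--ocr-engine", choices=["tesseract", "easyocr", "paddleocr"], default="tesseract")
    parser.add_argument("--no-deduplicate", action="store_true")
    parser.add_argument("--extract-only", action="store_true")
    parser.add_argument("--format", choices=["detailed", "compact"], default="detailed")
    parser.add_argument("--verbose", "-v", action="store_true")
    return parser
```

For example, `build_parser().parse_args(["samples/meeting.mkv", "--interval", "10", "-v"])` produces a namespace with `interval=10`, `verbose=True`, and the documented defaults for everything else.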
+
+## Tips for Best Results
+
+### Scene Detection vs Interval
+- **Scene detection**: Better for presentations with distinct slides. More efficient.
+- **Interval extraction**: Better for continuous screen sharing (coding, browsing). More thorough.
+
+### OCR Engine Selection
+- **Tesseract**: Best for clean slides, documents, presentations. Fast and lightweight.
+- **EasyOCR**: Better for handwriting, rotated text, or varied fonts.
+- **PaddleOCR**: Excellent for code, terminal outputs, and mixed languages.
+
+### Deduplication
+- Enabled by default - removes similar consecutive frames
+- Disable with `--no-deduplicate` if slides change subtly
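
To illustrate why deduplication can hide subtle slide changes, here is a minimal sketch that drops an entry when its text is nearly identical to the previously kept one. It uses `difflib.SequenceMatcher` from the standard library; the `0.9` threshold and the function name `deduplicate` are assumptions, not the tool's actual algorithm.

```python
from difflib import SequenceMatcher
from typing import List, Tuple

Entry = Tuple[float, str]  # (timestamp in seconds, OCR text)


def deduplicate(entries: List[Entry], threshold: float = 0.9) -> List[Entry]:
    """Drop entries whose text is too similar to the last kept entry."""
    kept: List[Entry] = []
    for timestamp, text in entries:
        if kept and SequenceMatcher(None, kept[-1][1], text).ratio() >= threshold:
            continue  # near-duplicate of the previous kept frame
        kept.append((timestamp, text))
    return kept
```

With this threshold, `"Revenue: 10M"` followed by `"Revenue: 12M"` collapses into one entry because only a single character differs, which is the kind of case where `--no-deduplicate` helps.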
+
+## Troubleshooting
+
+### "pytesseract not installed"
+```bash
+pip install pytesseract
+sudo apt-get install tesseract-ocr  # Don't forget system package!
+```
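
Both pieces are needed: the `pytesseract` Python package and the `tesseract` system binary. A quick way to see which one is missing is sketched below using only the standard library; the function name `check_tesseract_setup` is hypothetical.

```python
import importlib.util
import shutil
from typing import Dict


def check_tesseract_setup() -> Dict[str, bool]:
    """Report whether the Python wrapper and the system binary are available."""
    return {
        # True if `import pytesseract` would find the package
        "pytesseract (pip package)": importlib.util.find_spec("pytesseract") is not None,
        # True if a `tesseract` executable is on PATH
        "tesseract (system binary)": shutil.which("tesseract") is not None,
    }


if __name__ == "__main__":
    for name, ok in check_tesseract_setup().items():
        print(f"{name}: {'found' if ok else 'MISSING'}")
```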
+
+### "No frames extracted"
+- Check that the video file is valid: `ffprobe video.mkv`
+- Try a lower interval: `--interval 3`
+- Check disk space in frames directory
+
+### Poor OCR quality
+- Try a different OCR engine
+- Check if video resolution is sufficient
+- Use `--no-deduplicate` to keep more frames
+
+### Scene detection not working
+- The tool falls back to interval extraction automatically
+- Ensure FFmpeg is installed
+- Try a manual interval: `--interval 5`
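
The automatic fallback mentioned above could be structured as follows. This is a sketch under stated assumptions: `scene_fn` and `interval_fn` are hypothetical callables that each return a list of extracted frame paths, and the exception types caught are a guess at what a failed FFmpeg call might raise.

```python
import subprocess
from typing import Callable, List


def extract_with_fallback(
    scene_fn: Callable[[], List[str]],
    interval_fn: Callable[[], List[str]],
) -> List[str]:
    """Try scene detection first; use interval extraction if it yields nothing."""
    try:
        frames = scene_fn()
    except (OSError, subprocess.CalledProcessError):
        # OSError covers a missing ffmpeg binary (FileNotFoundError);
        # CalledProcessError covers a non-zero ffmpeg exit status.
        frames = []
    if frames:
        return frames
    return interval_fn()
```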
+
+## Project Structure
+
+```
+meetus/
+├── meetus/                  # Main package
+│   ├── __init__.py
+│   ├── frame_extractor.py   # Video frame extraction
+│   ├── ocr_processor.py     # OCR processing
+│   └── transcript_merger.py # Transcript merging
+├── process_meeting.py       # Main CLI script
+├── requirements.txt         # Python dependencies
+└── README.md                # This file
+```
+
+## License
+
+For personal use.