Scribe is a command-line tool that downloads audio from video URLs, transcribes the content using OpenAI's Whisper, and generates markdown-formatted articles using Google's Gemini AI.
- Downloads audio from YouTube, Vimeo, and other platforms supported by yt-dlp
- Transcribes with OpenAI Whisper, using GPU acceleration when available
- Generates well-structured markdown articles using Gemini 2.0 Flash
- Handles format conversion and quality optimization
- Creates standalone articles with proper formatting and structure
- Python 3.11 or higher
- uv package manager
- FFmpeg (for audio processing)
- Google Gemini API key
- GPU recommended for faster Whisper transcription
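The prerequisites above can be verified with a short preflight check before installing. This is a minimal sketch using only the standard library; the function name `check_prerequisites` is hypothetical and not part of Scribe, and it assumes the tools are installed under the command names `uv` and `ffmpeg`.

```python
import shutil
import sys


def check_prerequisites(which=shutil.which, version=sys.version_info):
    """Return a list of problems; an empty list means every check passed."""
    problems = []
    # Compare only (major, minor) against the documented minimum.
    if tuple(version[:2]) < (3, 11):
        problems.append("Python 3.11 or higher is required")
    # Both tools must be discoverable on PATH.
    for tool in ("uv", "ffmpeg"):
        if which(tool) is None:
            problems.append(f"{tool} was not found on PATH")
    return problems
```

Calling `check_prerequisites()` with no arguments inspects the current interpreter and PATH; the parameters exist only so the check can be exercised without touching the real environment.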
- Clone the repository:
git clone https://github.com/senko/scribe.git
cd scribe
- Set up the virtual environment and install dependencies:
uv sync
- Set up your Gemini API key:
export GEMINI_API_KEY="your-api-key-here"
Usage:
python scribe.py <video_url> <description>
Example:
python scribe.py "https://youtube.com/watch?v=abc123" "Interview with John Doe about AI development"
The description gives the AI summarizer more context, for example who is speaking and what the overall topic is.
The generated article will be printed to stdout in markdown format.
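The two positional arguments in the usage line could be parsed along these lines. This is a sketch rather than Scribe's actual code: the argument names mirror the usage string, while the `parse_args` helper and the help texts are assumptions.

```python
import argparse


def parse_args(argv=None):
    """Parse the video URL and the free-text description from the command line."""
    parser = argparse.ArgumentParser(
        prog="scribe.py",
        description="Turn a video into a markdown article.",
    )
    parser.add_argument("video_url", help="URL of the video to download audio from")
    parser.add_argument("description", help="context for the summarizer, e.g. speakers and topic")
    # argv=None makes argparse fall back to sys.argv[1:].
    return parser.parse_args(argv)
```

Because the article is written to stdout, ordinary shell redirection (for example, appending `> article.md` to the command) saves it to a file.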
GEMINI_API_KEY (required): Your Google Gemini API key
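A script can fail fast when the variable is missing instead of erroring later during summarization. The sketch below shows one way to do that; `get_api_key` is a hypothetical helper, and only the variable name and the error message come from this README.

```python
import os


def get_api_key(environ=os.environ):
    """Return the Gemini API key, or raise if it is unset or blank."""
    key = environ.get("GEMINI_API_KEY", "").strip()
    if not key:
        raise RuntimeError("GEMINI_API_KEY environment variable not set")
    return key
```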
The script uses:
- Local Whisper Model: turbo (configurable in the WHISPER_MODEL constant)
- Gemini Model: gemini-2.0-flash-exp
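To show how these two models might fit together, here is a rough sketch of a download, transcribe, and summarize pipeline. It is not Scribe's implementation. It assumes the `yt-dlp`, `openai-whisper`, and `google-generativeai` packages are installed; `GEMINI_MODEL`, `build_ydl_opts`, `build_prompt`, `run`, the prompt wording, and the MP3 intermediate format are all illustrative choices, and only the `WHISPER_MODEL` name and the two model identifiers come from this README.

```python
import os
import tempfile

WHISPER_MODEL = "turbo"                # constant name taken from the README
GEMINI_MODEL = "gemini-2.0-flash-exp"  # hypothetical constant name


def build_ydl_opts(out_dir):
    """yt-dlp options that extract the best audio stream to MP3 via FFmpeg."""
    return {
        "format": "bestaudio/best",
        "outtmpl": os.path.join(out_dir, "audio.%(ext)s"),
        "postprocessors": [{"key": "FFmpegExtractAudio", "preferredcodec": "mp3"}],
        "quiet": True,
    }


def build_prompt(description, transcript):
    """Combine the user's description with the transcript (assumed prompt wording)."""
    return (
        "Write a standalone, well-structured markdown article based on this transcript.\n\n"
        f"Context: {description}\n\nTranscript:\n{transcript}"
    )


def run(video_url, description, api_key):
    # Third-party imports are deferred so the helpers above work without them.
    import google.generativeai as genai
    import whisper
    import yt_dlp

    with tempfile.TemporaryDirectory() as tmp:
        # Step 1: download the audio track into a temporary directory.
        with yt_dlp.YoutubeDL(build_ydl_opts(tmp)) as ydl:
            ydl.download([video_url])
        # Step 2: transcribe locally while the audio file still exists.
        audio_path = os.path.join(tmp, "audio.mp3")
        transcript = whisper.load_model(WHISPER_MODEL).transcribe(audio_path)["text"]

    # Step 3: ask Gemini for the markdown article.
    genai.configure(api_key=api_key)
    model = genai.GenerativeModel(GEMINI_MODEL)
    return model.generate_content(build_prompt(description, transcript)).text
```

Keeping the option and prompt builders as plain functions means they can be inspected or adjusted without downloading anything or calling either model.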
Common issues:
"GEMINI_API_KEY environment variable not set"
- Ensure you've set the API key:
export GEMINI_API_KEY="your-key"
"Error downloading audio"
- Check if the video URL is valid and accessible
- Ensure FFmpeg is installed and in your PATH
"Error transcribing audio"
- Check available disk space in the temporary directory
- For GPU issues, verify CUDA installation or use CPU fallback
"Error generating summary"
- Verify your Gemini API key is valid
- Check your API quota and billing status
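For the GPU issues listed under "Error transcribing audio", a quick way to see whether CUDA is usable is to ask PyTorch directly. The sketch below is an assumption about how a CPU fallback could be chosen, not a description of what Scribe does; `pick_device` is a hypothetical helper.

```python
def pick_device():
    """Return "cuda" when PyTorch reports a usable GPU, otherwise "cpu"."""
    try:
        import torch
    except ImportError:
        # Without PyTorch there is nothing to query, so assume CPU.
        return "cpu"
    return "cuda" if torch.cuda.is_available() else "cpu"
```

The result can be passed to Whisper's `load_model` through its `device` parameter to force CPU transcription when the GPU setup is broken.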
This project is licensed under the GNU Affero General Public License v3.0 (AGPL-3.0).