senko/scribe

Scribe

Scribe is a command-line tool that downloads audio from video URLs, transcribes the content using OpenAI's Whisper, and generates markdown-formatted articles using Google's Gemini AI.

Features

  • Downloads audio from YouTube, Vimeo, and other platforms supported by yt-dlp
  • Transcribes audio with OpenAI Whisper, using GPU acceleration when available
  • Generates well-structured markdown articles using Gemini 2.0 Flash
  • Handles format conversion and quality optimization
  • Creates standalone articles with proper formatting and structure
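
As a rough illustration of the download step, the sketch below builds a yt-dlp options dictionary that extracts audio through FFmpeg. The README does not show which options scribe.py actually passes, so the helper names (`audio_download_options`, `download_audio`), the output template, and the default codec are assumptions for illustration only.

```python
import os


def audio_download_options(output_dir, codec="mp3"):
    """Build yt-dlp options that keep only the audio track (hypothetical helper)."""
    return {
        # Prefer an audio-only stream, fall back to the best combined stream.
        "format": "bestaudio/best",
        # Name the file after the video id inside output_dir.
        "outtmpl": os.path.join(output_dir, "%(id)s.%(ext)s"),
        # Convert the download to the requested codec; this step needs FFmpeg.
        "postprocessors": [
            {"key": "FFmpegExtractAudio", "preferredcodec": codec},
        ],
        "quiet": True,
    }


def download_audio(url, output_dir):
    """Download audio for one URL; requires the yt-dlp package and FFmpeg."""
    import yt_dlp  # imported lazily so the options helper works without it

    with yt_dlp.YoutubeDL(audio_download_options(output_dir)) as ydl:
        ydl.download([url])


opts = audio_download_options("downloads")
print(opts["format"])
```

Keeping the options in a separate function makes them easy to inspect or adjust without triggering a download.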

Quick Start

Prerequisites

  • Python 3.11 or higher
  • uv package manager
  • FFmpeg (for audio processing)
  • Google Gemini API key
  • GPU recommended for faster Whisper transcription

Installation

  1. Clone the repository:
     git clone https://github.com/senko/scribe.git
     cd scribe
  2. Set up the virtual environment and install dependencies:
     uv sync
  3. Set up your Gemini API key:
     export GEMINI_API_KEY="your-api-key-here"

Usage

python scribe.py <video_url> <description>

Example:

python scribe.py "https://youtube.com/watch?v=abc123" "Interview with John Doe about AI development"

The description gives the AI summarizer more context, for example who is speaking and what the overall topic is.

The generated article will be printed to stdout in markdown format.
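
A command line with these two positional arguments could be parsed roughly as follows. This is a minimal sketch using the standard library's argparse, not necessarily how scribe.py does it; the `parse_args` helper and the help strings are assumptions.

```python
import argparse


def parse_args(argv=None):
    """Parse the two positional arguments from the usage line (hypothetical helper)."""
    parser = argparse.ArgumentParser(
        prog="scribe.py",
        description="Download, transcribe, and summarize a video.",
    )
    parser.add_argument("video_url", help="URL of the video to process")
    parser.add_argument(
        "description",
        help="context for the summarizer, such as the speakers and the topic",
    )
    # argv=None makes argparse read sys.argv[1:], the normal CLI behaviour.
    return parser.parse_args(argv)


# Example with an explicit argument list instead of the real command line.
args = parse_args([
    "https://youtube.com/watch?v=abc123",
    "Interview with John Doe about AI development",
])
print(args.video_url)
print(args.description)
```

Because the article goes to stdout, it can be saved with a shell redirect such as `python scribe.py <video_url> <description> > article.md`.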

Configuration

Environment Variables

  • GEMINI_API_KEY (required): Your Google Gemini API key
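
A script can fail early with a clear message when the key is missing. The sketch below shows one way to do that check; the `get_api_key` helper is hypothetical, and only the variable name and the error text come from this README.

```python
import os


def get_api_key(env=None):
    """Return the Gemini API key, or raise if it is unset or blank."""
    # Accept a mapping for testing; default to the real process environment.
    env = os.environ if env is None else env
    key = env.get("GEMINI_API_KEY", "").strip()
    if not key:
        raise RuntimeError("GEMINI_API_KEY environment variable not set")
    return key


# Example with an explicit mapping instead of the real environment.
print(get_api_key({"GEMINI_API_KEY": "your-api-key-here"}))
```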

Model Configuration

The script uses:

  • Local Whisper Model: turbo (configurable via the WHISPER_MODEL constant)
  • Gemini Model: gemini-2.0-flash-exp
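
The sketch below shows how a model constant and GPU detection might fit together. It assumes the openai-whisper and PyTorch packages; the helper names (`pick_device`, `load_whisper`) and the `GEMINI_MODEL` constant are illustrative and may not match scribe.py, which this README only says exposes `WHISPER_MODEL`.

```python
WHISPER_MODEL = "turbo"  # model name taken from this README
GEMINI_MODEL = "gemini-2.0-flash-exp"  # hypothetical constant name


def pick_device():
    """Return "cuda" when PyTorch reports a usable GPU, otherwise "cpu"."""
    try:
        import torch
    except ImportError:
        # Without PyTorch there is no GPU path, so fall back to the CPU.
        return "cpu"
    return "cuda" if torch.cuda.is_available() else "cpu"


def load_whisper(model_name=WHISPER_MODEL):
    """Load a local Whisper model on the best available device."""
    import whisper  # provided by the openai-whisper package

    return whisper.load_model(model_name, device=pick_device())


print(pick_device())
```

Switching to a smaller Whisper model such as `base` trades accuracy for speed and memory, which can help on machines without a GPU.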

Troubleshooting

Common issues:

"GEMINI_API_KEY environment variable not set"

  • Ensure you've set the API key: export GEMINI_API_KEY="your-key"

"Error downloading audio"

  • Check if the video URL is valid and accessible
  • Ensure FFmpeg is installed and in your PATH
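
To check the FFmpeg requirement before running the tool, a short standard-library script such as the following can help. The `find_ffmpeg` helper is a hypothetical name used only for this example.

```python
import shutil


def find_ffmpeg(which=shutil.which):
    """Return the full path to the ffmpeg executable, or None if it is not on PATH."""
    return which("ffmpeg")


path = find_ffmpeg()
print(path or "ffmpeg not found on PATH")
```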

"Error transcribing audio"

  • Check available disk space in temporary directory
  • For GPU issues, verify CUDA installation or use CPU fallback
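
Free space in the temporary directory can be checked with the standard library. The sketch assumes the tool writes to the system default temporary directory, which this README does not confirm, and the 1 GiB threshold is an arbitrary example value.

```python
import shutil
import tempfile


def temp_free_bytes():
    """Return the free space, in bytes, on the volume holding the temp directory."""
    return shutil.disk_usage(tempfile.gettempdir()).free


def has_room(required_bytes, free_bytes=None):
    """Report whether at least required_bytes are free (hypothetical helper)."""
    if free_bytes is None:
        free_bytes = temp_free_bytes()
    return free_bytes >= required_bytes


one_gib = 1024 ** 3  # arbitrary example threshold
print(tempfile.gettempdir(), "has room:", has_room(one_gib))
```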

"Error generating summary"

  • Verify your Gemini API key is valid
  • Check your API quota and billing status

License

This project is licensed under the GNU Affero General Public License v3.0 (AGPL-3.0).

