WiseArts/transcribe.py

🎙 transcribe.py

A clean, terminal-based video and audio transcription tool powered by OpenAI Whisper — fully open-source, runs locally, no API keys required.

Python 3.8+ License: MIT


Features

  • 🎬 Transcribe video or audio files directly from the terminal
  • 📁 Batch-transcribe all supported media files in a folder using one shared set of settings
  • ⚡ Choose from 5 Whisper model sizes — from blazing-fast to highly accurate
  • 📄 Export to plain text, SRT, WebVTT, or JSON (with timestamps)
  • 🌍 Automatic language detection — no configuration needed
  • 🔒 Runs entirely offline — your files never leave your machine
  • 💅 Clean, interactive UI powered by Rich

Requirements

  • Python 3.8+
  • ffmpeg (system-level)

Installation

1. Clone the repo

git clone https://github.com/WiseArts/transcribe.git
cd transcribe

2. Install ffmpeg

# macOS
brew install ffmpeg

# Ubuntu / Debian
sudo apt install ffmpeg

# Windows (via Chocolatey)
choco install ffmpeg

3. Create and activate a virtual environment

python3 -m venv .venv

# macOS / Linux
source .venv/bin/activate

# Windows (PowerShell)
.venv\Scripts\Activate.ps1

4. Install Python dependencies

python -m pip install --upgrade pip
pip install openai-whisper rich

Note: Use the tool from inside the activated virtual environment each time. The first time you run it, Whisper will automatically download the selected model weights and cache them locally. This is a one-time download per model.


Usage

# macOS / Linux
source .venv/bin/activate
python transcribe.py

On Windows PowerShell, activate it with .venv\Scripts\Activate.ps1 before running python transcribe.py.

The tool now lets you choose between:

  1. Single file mode — transcribe one video/audio file
  2. Folder batch mode — transcribe all supported files in one folder using the same model + output format

After choosing a mode, it walks you through:

  1. Input — file path or folder path (drag & drop into the terminal works on most systems)
  2. Model — pick a size based on how fast vs. accurate you need it
  3. Output format — choose how you want transcripts saved

The output file is saved alongside your source file (e.g. interview.mp4 → interview.srt).

In folder mode, each file is saved next to its source file (e.g. clip01.mp4 → clip01.srt).
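
The naming rule above amounts to swapping the file extension while keeping the folder and base name. Here is a minimal sketch of that rule; `output_path_for` is a hypothetical helper made up for illustration, not a function taken from transcribe.py, which may build the path differently.

```python
from pathlib import Path


def output_path_for(source: str, extension: str) -> Path:
    """Return the transcript path that sits next to the source file.

    Hypothetical helper for illustration only.
    """
    # with_suffix replaces only the final extension, so the folder
    # and the base name stay unchanged.
    return Path(source).with_suffix("." + extension.lstrip("."))


print(output_path_for("videos/interview.mp4", "srt"))
```

The extension is accepted with or without a leading dot, so both "srt" and ".srt" give the same result.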


Model Options

#  Model   Speed       Quality     VRAM    Best for
1  tiny    ██████████  ███░░░░░░░  ~1 GB   Quick drafts, fast machines
2  base    ████████░░  █████░░░░░  ~1 GB   Everyday use (default)
3  small   ██████░░░░  ███████░░░  ~2 GB   Better accuracy, still fast
4  medium  ████░░░░░░  █████████░  ~5 GB   High quality, multilingual
5  large   ██░░░░░░░░  ██████████  ~10 GB  Best possible accuracy

Supported File Formats

Video: .mp4 .mov .avi .mkv .webm .flv

Audio: .mp3 .wav .m4a .aac .ogg .flac
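
For folder batch mode, the extension lists above can serve as a filter when scanning a directory. The sketch below is one possible way to do that, not the tool's actual code: `find_media_files` and the constant names are hypothetical, and it assumes that only files directly inside the folder are considered (no subfolders) and that extensions are matched case-insensitively.

```python
from pathlib import Path
from typing import List

# Extensions copied from the lists in this README.
VIDEO_EXTENSIONS = {".mp4", ".mov", ".avi", ".mkv", ".webm", ".flv"}
AUDIO_EXTENSIONS = {".mp3", ".wav", ".m4a", ".aac", ".ogg", ".flac"}
SUPPORTED_EXTENSIONS = VIDEO_EXTENSIONS | AUDIO_EXTENSIONS


def find_media_files(folder: str) -> List[Path]:
    """Return supported media files directly inside folder, sorted by path.

    Hypothetical helper: skips subfolders and unsupported file types.
    """
    return sorted(
        path
        for path in Path(folder).iterdir()
        # Lowercase the suffix so CLIP.MP4 is treated like clip.mp4.
        if path.is_file() and path.suffix.lower() in SUPPORTED_EXTENSIONS
    )
```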


Output Formats

Format      Extension  Description
Plain text  .txt       Clean transcript, one line per segment
SRT         .srt       Subtitles with timestamps (video players, Premiere, etc.)
WebVTT      .vtt       Web subtitles for HTML5 <video> tags
JSON        .json      Full Whisper output with segment-level confidence data
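
To make the SRT row concrete, here is a rough sketch of turning Whisper segments into SRT text. It relies on the fact that each segment in Whisper's result is a dict with `start`, `end` (both in seconds) and `text`. The function names `srt_timestamp` and `segments_to_srt` are hypothetical, and transcribe.py may format its output differently.

```python
def srt_timestamp(seconds: float) -> str:
    """Format seconds as HH:MM:SS,mmm, the timestamp style SRT uses."""
    total_ms = int(round(seconds * 1000))
    hours, rest = divmod(total_ms, 3_600_000)
    minutes, rest = divmod(rest, 60_000)
    secs, ms = divmod(rest, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d},{ms:03d}"


def segments_to_srt(segments) -> str:
    """Render Whisper-style segments as numbered SRT cues.

    Hypothetical helper: expects dicts with 'start', 'end' and 'text'.
    """
    cues = []
    for index, segment in enumerate(segments, start=1):
        cues.append(
            f"{index}\n"
            f"{srt_timestamp(segment['start'])} --> "
            f"{srt_timestamp(segment['end'])}\n"
            f"{segment['text'].strip()}\n"
        )
    # A blank line separates consecutive cues.
    return "\n".join(cues)
```

WebVTT cues look almost the same, except that the file starts with a WEBVTT header line and timestamps use a period instead of a comma before the milliseconds.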

Performance Notes

  • CPU vs GPU: The script passes fp16=False to the transcribe() call so it works on any machine, including CPU-only ones, where half-precision inference is not supported. If you have an NVIDIA GPU with CUDA, remove the fp16=False flag in the transcribe() call for a significant speedup.
  • Speed: As a rough guide on CPU, base transcribes at roughly 4–8× real-time speed, so a 10-minute video takes around 1–3 minutes.
  • Accuracy: Whisper performs best on clear speech with minimal background noise. The medium and large models handle accents and technical vocabulary noticeably better.
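
As an alternative to editing the flag by hand, the fp16 setting can follow whatever hardware is present. This is only a sketch under assumptions: `transcribe_options` and `transcribe_file` are hypothetical names, not part of transcribe.py, and the second function assumes the openai-whisper and torch packages are installed.

```python
def transcribe_options(cuda_available: bool) -> dict:
    """Choose device and precision for the current machine.

    Hypothetical helper for illustration only.
    """
    return {
        "device": "cuda" if cuda_available else "cpu",
        # Half precision only pays off on a CUDA GPU; keep it off on CPU.
        "fp16": cuda_available,
    }


def transcribe_file(path: str, model_name: str = "base") -> dict:
    """Transcribe one file with settings matched to the hardware."""
    # Imported lazily so transcribe_options works without torch installed.
    import torch
    import whisper

    options = transcribe_options(torch.cuda.is_available())
    model = whisper.load_model(model_name, device=options["device"])
    return model.transcribe(path, fp16=options["fp16"])
```

Keeping the decision in a small pure function makes it easy to check on a machine without a GPU.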

Dependencies

Package         Purpose
openai-whisper  Speech-to-text transcription
rich            Terminal UI
ffmpeg          Audio extraction from video files

License

MIT — do whatever you like with it.
