This project is a web application that uses OpenAI Whisper to generate subtitles for videos or embed subtitles directly into video files.
The project includes:
- ⚙️ A FastAPI backend (running on port 8000)
- 🎨 A React frontend (running on port 3000)
- 🐳 A full Docker setup for easy execution
Requirements:
- Docker installed on your system
- Minimum system requirements vary depending on CPU or GPU usage:

CPU mode:
- Dual-core processor (Intel/AMD)
- 8 GB RAM (16 GB recommended for large videos)
- No GPU required

GPU mode (CUDA):
- NVIDIA GPU with CUDA Compute Capability 3.5+
- Latest NVIDIA GPU drivers
- CUDA Toolkit
- NVIDIA Container Toolkit
You can check your GPU's compatibility on NVIDIA's CUDA GPUs page.
⚠️ Note: If you're using CUDA, make sure the NVIDIA Container Toolkit is correctly configured on your machine.
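
One way to confirm the toolkit is configured, once the driver and toolkit are installed, is to run `nvidia-smi` inside a throwaway container. The sketch below follows the sample workload from NVIDIA's Container Toolkit documentation. The function name `check_docker_gpu` and the `DOCKER_BIN` override are illustrative assumptions, not part of this project.

```shell
# Hypothetical helper (not part of this project): checks that containers can see the GPU.
# DOCKER_BIN is an assumed override so a different Docker binary can be substituted.
check_docker_gpu() {
  docker_bin="${DOCKER_BIN:-docker}"
  # --gpus all asks Docker to expose every GPU to the container;
  # the NVIDIA runtime makes nvidia-smi available inside it.
  if "$docker_bin" run --rm --gpus all ubuntu nvidia-smi; then
    echo "GPU is visible inside containers"
  else
    echo "GPU is NOT visible inside containers; re-check the toolkit setup" >&2
    return 1
  fi
}
```

If everything is set up, calling `check_docker_gpu` should print the same kind of GPU table that `nvidia-smi` prints on the host, followed by the confirmation message.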

Install the NVIDIA driver:
sudo apt update
sudo apt install -y nvidia-driver-535
sudo reboot

Verify with:
nvidia-smi

Install the NVIDIA Container Toolkit:
distribution=$(. /etc/os-release;echo $ID$VERSION_ID)
curl -s -L https://nvidia.github.io/libnvidia-container/gpgkey | sudo gpg --dearmor -o /usr/share/keyrings/nvidia-container-toolkit-keyring.gpg
curl -s -L https://nvidia.github.io/libnvidia-container/$distribution/libnvidia-container.list | \
sed 's#deb https://#deb [signed-by=/usr/share/keyrings/nvidia-container-toolkit-keyring.gpg] https://#g' | \
sudo tee /etc/apt/sources.list.d/nvidia-container-toolkit.list
sudo apt-get update
sudo apt-get install -y nvidia-container-toolkit
sudo nvidia-ctk runtime configure --runtime=docker
# Restart Docker Desktop manually or with: sudo systemctl restart docker (Linux only)

We use .env files to switch between CPU and GPU builds.

To run on CPU:
cp .env.cpu .env
docker compose --env-file .env up --build

To run on GPU:
cp .env.gpu .env
docker compose --env-file .env up --build

- The backend API will be available at: http://localhost:8000
- The frontend UI will be available at: http://localhost:3000
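
The CPU/GPU switch and the two URLs above can be wrapped in a small helper. This is a sketch only: the function names `pick_env_file` and `wait_for_url` are hypothetical, and it assumes that `curl` is installed and that a working `nvidia-smi` on the host is a good enough signal to choose the GPU build. The `/docs` path in the usage comments is FastAPI's default interactive docs page, which this project may or may not leave enabled.

```shell
# Hypothetical helpers, not shipped with this project.

# Print .env.gpu when a working nvidia-smi is found on the host, otherwise .env.cpu.
pick_env_file() {
  if command -v nvidia-smi >/dev/null 2>&1 && nvidia-smi >/dev/null 2>&1; then
    echo ".env.gpu"
  else
    echo ".env.cpu"
  fi
}

# Try a URL up to N times (default 60), pausing one second between attempts.
# Returns 0 as soon as the URL answers successfully, 1 if every attempt fails.
wait_for_url() {
  url=$1
  remaining=${2:-60}
  while [ "$remaining" -gt 0 ]; do
    # -s silences progress output, -f makes HTTP errors (4xx/5xx) count as failures.
    if curl -sf -o /dev/null --max-time 2 "$url"; then
      return 0
    fi
    sleep 1
    remaining=$(( remaining - 1 ))
  done
  return 1
}

# Example usage (commented out so sourcing this file has no side effects):
# cp "$(pick_env_file)" .env
# docker compose --env-file .env up --build -d
# wait_for_url http://localhost:8000/docs 120 && echo "backend is up"
# wait_for_url http://localhost:3000 120 && echo "frontend is up"
```

Polling is useful here because the first build can take a while, and the ports only start answering once both containers have finished starting.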

You can use this app from any device connected to the same Wi-Fi network by visiting:
http://<your-ip-address>:3000

To find your IP address, open a terminal on the machine running the app and run the command for your system.

On Windows:
ipconfig
Look for the line: IPv4 Address.

On macOS/Linux:
ifconfig | grep 192.

Copy the IP address (usually something like 192.168.x.x) and open it in any browser:
http://192.168.x.x:3000
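
The `grep 192.` filter above only catches addresses that contain 192. As an alternative sketch, the hypothetical function below (its name and regular expression are not from this project) pulls the first private IPv4 address out of whatever text is piped into it, covering the 10.x.x.x, 172.16-31.x.x, and 192.168.x.x ranges.

```shell
# Hypothetical helper: print the first private (RFC 1918) IPv4 address found on stdin.
first_private_ipv4() {
  # -E: extended regex, -o: print only the matched text, -w: require word boundaries
  # so that an address like 210.5.6.7 is not mistaken for 10.5.6.7.
  grep -Eow '(10\.[0-9]{1,3}|172\.(1[6-9]|2[0-9]|3[01])|192\.168)\.[0-9]{1,3}\.[0-9]{1,3}' | head -n 1
}

# Example usage (macOS/Linux):
# ifconfig | first_private_ipv4
```

On a machine running Docker, the first match may be a Docker bridge address (often in the 172.17.x.x range) rather than the Wi-Fi address, so it is worth checking that the result belongs to the right network interface.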
- Upload a video file
- Select input language and subtitle format
- Option to generate .srt, .vtt, .txt, etc.
- Option to embed subtitles directly into the video
- Download subtitle file and/or subtitled video
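
When choosing a subtitle format, it may help to know that .srt and .vtt differ mainly in cue syntax: SRT timestamps put a comma before the milliseconds, while WebVTT uses a period. The sketch below illustrates the two timestamp styles. `format_timestamp` is a hypothetical helper for illustration only and says nothing about how this app formats its output internally.

```shell
# Hypothetical helper: convert milliseconds to an HH:MM:SS<sep>mmm timestamp.
# Pass "," as the separator for SRT style or "." for WebVTT style.
format_timestamp() {
  ms=$1
  sep=$2
  hours=$(( ms / 3600000 ))
  minutes=$(( ms % 3600000 / 60000 ))
  seconds=$(( ms % 60000 / 1000 ))
  millis=$(( ms % 1000 ))
  printf '%02d:%02d:%02d%s%03d\n' "$hours" "$minutes" "$seconds" "$sep" "$millis"
}

# Example usage:
# format_timestamp 3723500 ","   # SRT style:    01:02:03,500
# format_timestamp 3723500 "."   # WebVTT style: 01:02:03.500
```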
MIT License.