- 📁 Transcribe local audio and video files
- 🌍 Automatic or manual language detection
- ⏱️ Word & sentence-level timestamps
- 💾 Multiple output formats: SRT, JSON, TXT
- 🤖 Supports multiple state-of-the-art Whisper & WhisperX models
- ⚙️ Advanced and customizable transcription options & optimizations
- 👥 Speaker diarization
- 📚 Mass transcription
Before starting: you may run into incompatibility issues with this method if your GPU is very new (e.g. RTX 50 series). I am working on it. In the meantime, feel free to try the alternative installation methods, which may be more stable on some devices.
Download the corresponding binary files from the Releases page.
| OS | Variant | Download | Extra prerequisite | How to run |
|---|---|---|---|---|
| Windows 10/11 | CPU | WhisperGUI-cpu-win64.exe | – | Double-click |
| Windows 10/11 | CUDA | WhisperGUI-cuda-win64.7z.001 + .002 | 1. 7-Zip<br>2. NVIDIA drivers | 1. Place both parts together<br>2. Right-click the .001 → 7-Zip ▸ Extract Here<br>3. Run WhisperGUI-cuda-win64.exe |
| Linux x86-64 | CPU | WhisperGUI-cpu-x86_64.AppImage | – | 1. `chmod +x WhisperGUI-cpu-x86_64.AppImage`<br>2. `./WhisperGUI-cpu-x86_64.AppImage` |
| Linux x86-64 | CUDA | WhisperGUI-cuda-x86_64.7z.001 + .002 | 1. 7-Zip (`sudo apt install p7zip-full`)<br>2. NVIDIA drivers | 1. `cat *.7z.* > all.7z && 7z x all.7z`<br>2. `chmod +x WhisperGUI-cuda-x86_64.AppImage`<br>3. `./WhisperGUI-cuda-x86_64.AppImage` |
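
The `cat *.7z.* > all.7z` step in the Linux CUDA instructions concatenates the downloaded parts, in name order, into one archive that 7-Zip can then extract. For anyone who prefers to do the join from Python, here is a minimal sketch of the same idea. The function name `join_parts` and its parameters are hypothetical and not part of this project. It only joins the parts; extraction still needs 7-Zip.

```python
from pathlib import Path
import shutil


def join_parts(directory, pattern="*.7z.*", output_name="all.7z"):
    """Concatenate split archive parts (e.g. .7z.001, .7z.002) into one file.

    Hypothetical helper: mirrors `cat *.7z.* > all.7z` from the instructions.
    """
    directory = Path(directory)
    # Sort by name so that .001 comes before .002, as the shell glob would.
    parts = sorted(directory.glob(pattern))
    if not parts:
        raise FileNotFoundError(f"No files matching {pattern!r} in {directory}")

    output = directory / output_name
    with output.open("wb") as out:
        for part in parts:
            with part.open("rb") as src:
                # Stream each part into the output without loading it in memory.
                shutil.copyfileobj(src, out)
    return output
```

Calling `join_parts(".")` in the download folder would produce `all.7z` next to the parts, after which `7z x all.7z` proceeds as in the table.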
Your transcriptions will be saved by default in the outputs folder of the repository.
Note: Binaries are in alpha stage.
Not yet available: macOS and AMD GPU builds. Use the alternative installation methods.
- Anaconda or Miniconda installed and conda added to PATH.
- git installed and added to PATH. See instructions.
- ffmpeg installed and added to PATH. See instructions for Windows, Linux or macOS.
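
Before running the install script, it can be worth confirming that all three tools are actually reachable from PATH. The snippet below is a small sketch using only the Python standard library; the function name `missing_tools` is hypothetical and not part of this repository.

```python
import shutil

# The three prerequisites listed above.
REQUIRED_TOOLS = ("conda", "git", "ffmpeg")


def missing_tools(tools=REQUIRED_TOOLS):
    """Return the tools from `tools` that cannot be found on PATH."""
    return [tool for tool in tools if shutil.which(tool) is None]


if __name__ == "__main__":
    missing = missing_tools()
    if missing:
        print("Not found on PATH:", ", ".join(missing))
    else:
        print("All prerequisites found on PATH.")
```

If anything is reported as missing, install it (or add it to PATH) and open a new terminal before retrying.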
On Windows, run the whisper-gui.bat file.
On Linux / macOS, run the whisper-gui.sh file.
Follow the instructions and let the script install the necessary dependencies. After the process, it will run the GUI in a new browser tab.
To start the program again later, run the same whisper-gui.bat or whisper-gui.sh file (depending on your OS). It also checks for updates to this repository automatically.
- Create a conda environment with Python 3.10
conda create --name whisperx python=3.10
conda activate whisperx
- Install PyTorch 2.0
For macOS:
conda install pytorch::pytorch==2.0.0 torchaudio==2.0.0 -c pytorch
For Windows or Linux, if you have an Nvidia GPU:
conda install pytorch==2.0.0 torchaudio==2.0.0 pytorch-cuda=11.8 -c pytorch -c nvidia
For Linux, if you have an AMD GPU:
pip install torch==2.0.0 torchaudio==2.0.0 --index-url https://download.pytorch.org/whl/rocm6.0
Otherwise, install for CPU:
conda install pytorch==2.0.0 torchaudio==2.0.0 cpuonly -c pytorch
- Install whisperx and dependencies
pip install git+https://github.com/m-bain/whisperx.git
Original instructions in: https://github.com/m-bain/whisperX
- Install additional libraries
pip install gradio
- Clone this repository
git clone https://github.com/Pikurrot/whisper-gui
- Run the GUI
conda activate whisperx
python main.py --autolaunch
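
If the GUI fails to start or transcription is unexpectedly slow, one thing to check is which PyTorch build ended up in the whisperx environment. The following is a minimal sketch for that check, to be run with the environment activated; `describe_torch` is a hypothetical helper name, and the only PyTorch calls it relies on are `torch.__version__` and `torch.cuda.is_available()`.

```python
import importlib.util


def describe_torch():
    """Summarize the installed PyTorch build, or report that it is missing."""
    if importlib.util.find_spec("torch") is None:
        return "PyTorch is not installed in this environment."

    import torch  # imported lazily so the check itself works without PyTorch

    return (
        f"PyTorch {torch.__version__} is installed; "
        f"torch.cuda.is_available() = {torch.cuda.is_available()}"
    )


if __name__ == "__main__":
    print(describe_torch())
```

A `False` result after a GPU install usually points to a CPU-only build or a driver problem rather than to the GUI itself.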
To run this software in a Docker container, visit this Docker Hub project.
Thank you 3x3cut0r!
Simply run the following command in the repository root:
docker-compose up -d
Thank you MarkAC007!
This project is primarily distributed under the terms of the MIT License. See the LICENSE file for details.
Third-Party Code
Portions of this project incorporate code from WhisperX, which is licensed under the BSD-4-Clause license. This code is used in accordance with its license, and the full text of the license can be found within the relevant source files.