# wtffmpeg

`wtffmpeg` is a command-line tool that uses a Large Language Model (LLM) to translate plain-English descriptions of video and audio tasks into executable `ffmpeg` commands. It can use local models via Ollama or other OpenAI-compatible APIs, as well as the official OpenAI API.

Stop searching through Stack Overflow and documentation for that one specific `ffmpeg` flag. Just ask for what you want.
## Example

```
> wtff "convert test_pattern.mp4 to a gif file"
INFO: No API key or LLM_API_URL env var provided. Defaulting to local Ollama at http://localhost:11434

--- Generated ffmpeg Command ---
ffmpeg -i test_pattern.mp4 -vf "fps=10,scale=320:-1:flags=lanczos" output.gif
------------------------------

Execute? [y/N], or (c)opy to clipboard: y
Executing: ffmpeg -i test_pattern.mp4 -vf "fps=10,scale=320:-1:flags=lanczos" output.gif
ffmpeg version n7.0 Copyright (c) 2000-2024 the FFmpeg developers
...
```
## Features

- **Natural Language Interface:** Describe complex `ffmpeg` operations in plain English.
- **Flexible LLM Backend:**
  - **Local First:** Connects to local LLMs like Ollama by default.
  - **Cloud Support:** Easily switch to the OpenAI API or other services using an API key or bearer token.
- **Interactive Execution:** Shows you the generated command for review before asking whether to execute it.
- **Clipboard Integration:** Copy commands to your clipboard with a single keypress.
- **Interactive Mode:** Run multiple commands in a persistent session.
## Installation

This script uses `uv` to manage its dependencies.

- **Install `uv`:** If you don't have `uv` installed, you can install it with:

  ```bash
  curl -LsSf https://astral.sh/uv/install.sh | sh
  source "$HOME/.cargo/env"
  ```

- **Clone and Run:** The script is designed to be run directly from the cloned repository. `uv` will automatically create a virtual environment and install the required dependencies (`openai` and `pyperclip`) on the first run.

  ```bash
  git clone https://github.com/scottvr/wtffmpeg.git
  cd wtffmpeg
  chmod +x wtffmpeg.py
  ./wtffmpeg.py "your first prompt"
  ```

- **Create an Alias (Optional):** For easier access, you can create a symbolic link to the script in a directory that is in your system's `PATH`:

  ```bash
  sudo ln -s "$(pwd)/wtffmpeg.py" /usr/local/bin/wtff
  ```

  Now you can run the tool from anywhere by simply typing `wtff`.
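If you'd rather not write to `/usr/local/bin`, a plain shell alias works too. A minimal sketch, assuming the repo was cloned to `~/src/wtffmpeg` (adjust the path to wherever you cloned it):

```bash
# Add to ~/.bashrc or ~/.zshrc; the repo path shown here is illustrative
alias wtff="$HOME/src/wtffmpeg/wtffmpeg.py"
```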
## Configuration

`wtffmpeg` can be configured using command-line arguments or environment variables. You can create a `.env` file in the project directory or export these variables in your shell. An example `.env` file is provided.
- `WTFFMPEG_MODEL`: The model to use (e.g., `gpt-oss:20b`, `llama3`).
- `WTFFMPEG_LLM_API_URL`: The base URL for your local LLM API (e.g., `http://localhost:11434`). This is the default connection method if no API key is provided.
- `WTFFMPEG_OPENAI_API_KEY`: Your API key for the OpenAI service.
- `WTFFMPEG_BEARER_TOKEN`: A bearer token for authentication with other OpenAI-compatible services.
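For a local Ollama setup, a `.env` file built from the variables above might look like this (a sketch, not the repo's bundled example verbatim):

```bash
# .env — values shown are illustrative
WTFFMPEG_MODEL="llama3"
WTFFMPEG_LLM_API_URL="http://localhost:11434"
# Uncomment one of these to use a remote service instead:
# WTFFMPEG_OPENAI_API_KEY="sk-..."
# WTFFMPEG_BEARER_TOKEN="your-token-here"
```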
The tool decides which service to use based on the following priority:

1. If `--api-key` or `WTFFMPEG_OPENAI_API_KEY` is provided, it connects to the official OpenAI API.
2. If not, it looks for `--bearer-token` or `WTFFMPEG_BEARER_TOKEN` to authenticate with a custom API endpoint.
3. If neither of the above is present, it falls back to `--url` or `WTFFMPEG_LLM_API_URL`, defaulting to a local Ollama instance at `http://localhost:11434`.
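In practice, this means an API key wins even when a local URL is also set. A quick sketch of the precedence described above (the prompt text is illustrative):

```bash
# Both variables set: the API key takes priority, so this call goes to OpenAI
export WTFFMPEG_LLM_API_URL="http://localhost:11434"
export WTFFMPEG_OPENAI_API_KEY="sk-..."
wtff "trim the first 5 seconds off intro.mp4"

# Unset the key and the same prompt falls back to the local Ollama instance
unset WTFFMPEG_OPENAI_API_KEY
wtff "trim the first 5 seconds off intro.mp4"
```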
## Usage

```
usage: wtffmpeg.py [-h] [--model MODEL] [--api-key API_KEY] [--bearer-token BEARER_TOKEN] [--url URL] [-x] [-c] [-i] [prompt]

Translate natural language to an ffmpeg command.

positional arguments:
  prompt                The natural language instruction for the ffmpeg command. Required unless running in interactive mode.

options:
  -h, --help            show this help message and exit
  --model MODEL         The model to use. For Ollama, this should be a model you have downloaded. Defaults to the WTFFMPEG_MODEL env var, then 'gpt-oss:20b'.
  --api-key API_KEY     OpenAI API key. Defaults to WTFFMPEG_OPENAI_API_KEY environment variable.
  --bearer-token BEARER_TOKEN
                        Bearer token for authentication. Defaults to WTFFMPEG_BEARER_TOKEN environment variable.
  --url URL             Base URL for a local LLM API (e.g., http://localhost:11434). Defaults to WTFFMPEG_LLM_API_URL env var, then http://localhost:11434. The '/v1' suffix for OpenAI compatibility will be added automatically.
  -x, --execute         Execute the generated command without confirmation.
  -c, --copy            Copy the generated command to the clipboard.
  -i, --interactive     Enter interactive mode to run multiple commands.
```
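The `-x` and `-c` flags shortcut the confirmation prompt; a quick sketch based on the options above (filenames are illustrative):

```bash
# Skip the confirmation prompt entirely
wtff -x "normalize the audio in podcast.wav"

# Just copy the command to the clipboard instead of running it
wtff -c "burn subtitles.srt into movie.mp4"
```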
## Examples

**Using the Default Local Ollama Instance:**

```bash
# The tool defaults to http://localhost:11434 and the 'gpt-oss:20b' model
wtff "turn presentation.mov into a web-friendly mp4"

# Specify a different local model
wtff --model "codellama:7b" "extract the audio from lecture.mp4"
```
**Using the OpenAI API:**

```bash
# The tool will automatically switch to 'gpt-4o' as the default model
export WTFFMPEG_OPENAI_API_KEY="sk-..."
wtff "create a 10-second clip from movie.mkv starting at the 2 minute mark"

# Or provide the key as an argument
wtff --api-key "sk-..." "resize video.mp4 to 720p"
```
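**Using a Bearer Token with another OpenAI-compatible service** (a sketch; the host, token, and model names are placeholders):

```bash
# Point --url at any OpenAI-compatible endpoint; the '/v1' suffix is added automatically
wtff --bearer-token "your-token-here" --url "http://my-llm-host:8080" \
    --model "my-model" "mux subs.srt into movie.mkv"
```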
**Interactive Mode:**

```bash
# Start an interactive session with a specific model
wtff --model "gemma:2b" -i
```

Inside interactive mode, you can type `!` followed by a shell command to execute it directly. This is useful for listing files (`!ls -l`) or making quick edits.
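For instance, within a session you might mix shell escapes and prompts like this (inputs only; a sketch, since the exact session prompt text is not shown here):

```
!ls -l
extract the audio from lecture.mp4 as an mp3
!mv output.mp3 audio/
```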
## Disclaimer

This was largely made to amuse myself; consider it a piece of humorous performance art. But it borders so closely on being actually useful that I went to the trouble of documenting all of this. YMMV. Use at your own risk. The author is not responsible for any damage or data loss that may occur from using this tool. Always review generated commands before executing them, especially when working with important files.