Abogen is a web-first text-to-speech workstation. Drop in an EPUB, PDF, Markdown, or plain text file and Abogen will turn it into high-quality audio with perfectly synced subtitles. The new interface runs entirely inside your browser using Flask + htmx, so it behaves like a modern web app whether you launch it locally or from a container.
- Natural-sounding speech powered by Kokoro-82M with per-job voice, speed, GPU toggle, and subtitle style controls
- Clean dashboard that tracks the status, progress, and logs of every job in real time (thanks to htmx partial updates)
- Automatic chapter detection and subtitle generation with SRT/ASS exports
- Runs well in Docker, ships a REST-style JSON API, and works across macOS, Linux, and Windows
Abogen supports Python 3.10–3.12.
```bash
python -m venv .venv
source .venv/bin/activate  # On Windows use: .venv\Scripts\activate
pip install abogen
abogen
```
Then open http://localhost:8808 and drag in your documents. Jobs run in the background worker and the browser updates automatically.
Tip: Keep the terminal open while the server is running. Use `Ctrl+C` to stop it.
A lightweight Dockerfile lives in `abogen/Dockerfile`.
```bash
docker build -t abogen .
mkdir -p ~/abogen-data/uploads ~/abogen-data/outputs
docker run --rm \
  -p 8808:8808 \
  -v ~/abogen-data:/data \
  --name abogen \
  abogen
```
Browse to http://localhost:8808. Uploaded source files are stored in `/data/uploads` and rendered audio/subtitles appear in `/data/outputs`.
| Variable | Default | Purpose |
|---|---|---|
| `ABOGEN_HOST` | `0.0.0.0` | Bind address for the Flask server |
| `ABOGEN_PORT` | `8808` | HTTP port |
| `ABOGEN_DEBUG` | `false` | Enable Flask debug mode |
| `ABOGEN_UPLOAD_ROOT` | `/data/uploads` | Directory where uploaded files are stored |
| `ABOGEN_OUTPUT_ROOT` | `/data/outputs` | Directory for generated audio and subtitles (legacy alias of `ABOGEN_OUTPUT_DIR`) |
| `ABOGEN_OUTPUT_DIR` | `/data/outputs` | Container path for rendered audio/subtitles |
| `ABOGEN_SETTINGS_DIR` | `/config` | Container path for JSON settings/configuration |
| `ABOGEN_TEMP_DIR` | `/data/cache` (Docker) or platform cache dir | Container path for temporary audio working files |
| `ABOGEN_UID` | `1000` | UID that the container should run as (matches host user) |
| `ABOGEN_GID` | `1000` | GID that the container should run as (matches host group) |
Set any of these with `-e VAR=value` when starting the container.
To discover your local UID/GID for matching file permissions inside the container, run:

```bash
id -u
id -g
```

Use those values to populate `ABOGEN_UID`/`ABOGEN_GID` in your `.env` file.
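If you prefer to generate those `.env` lines from a script, the same values are available from Python's standard library on POSIX hosts (a small sketch; the function name is illustrative, only the `ABOGEN_UID`/`ABOGEN_GID` keys come from the table above):

```python
import os

def uid_gid_env_lines():
    """Build ABOGEN_UID/ABOGEN_GID .env lines from the current user (POSIX only)."""
    return [
        f"ABOGEN_UID={os.getuid()}",
        f"ABOGEN_GID={os.getgid()}",
    ]

print("\n".join(uid_gid_env_lines()))
```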
When running via Docker Compose, set `ABOGEN_SETTINGS_DIR`, `ABOGEN_OUTPUT_DIR`, and `ABOGEN_TEMP_DIR` in your `.env` file to the host directories you want mounted into the container. Compose maps them to `/config`, `/data/outputs`, and `/data/cache` respectively while exporting those in-container paths to the application. Non-audio caches (e.g., Hugging Face downloads) stick to the container's internal cache under `/tmp/abogen-home/.cache` by default, so only conversion scratch data touches the mounted `ABOGEN_TEMP_DIR`.

Ensure each host directory exists and is writable by the UID/GID you configure before starting the stack.
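A small pre-flight check can catch missing or read-only host directories before `docker compose up` (an illustrative sketch; the directory list is an assumption you would replace with your own `.env` values):

```python
import os
import sys

def preflight(dirs):
    """Return a list of problems: directories that are missing or not writable."""
    problems = []
    for d in dirs:
        if not os.path.isdir(d):
            problems.append(f"{d}: does not exist")
        elif not os.access(d, os.W_OK):
            problems.append(f"{d}: not writable by the current user")
    return problems

if __name__ == "__main__":
    # Hypothetical host paths; substitute the ones from your .env file.
    issues = preflight(["./config", "./data/outputs", "./data/cache"])
    for line in issues:
        print("FAIL:", line)
    sys.exit(1 if issues else 0)
```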
The repo includes `docker-compose.yaml`, which targets GPU hosts out of the box. Install the NVIDIA Container Toolkit and run:
```bash
docker compose up -d --build
```
Key build/runtime knobs:
- `TORCH_VERSION` – pin a specific PyTorch release that matches your driver (leave blank for the latest on the configured index).
- `TORCH_INDEX_URL` – swap out the PyTorch download index when targeting a different CUDA build.
- `ABOGEN_DATA` – host path that stores uploads/outputs (defaults to `./data`).
CPU-only deployment: comment out the `deploy.resources.reservations.devices` block (and the optional `runtime: nvidia` line) inside the compose file. Compose will then run without requesting a GPU. If you prefer the classic CLI:
```bash
docker build -f abogen/Dockerfile -t abogen-gpu .
docker run --rm \
  --gpus all \
  -p 8808:8808 \
  -v ~/abogen-data:/data \
  abogen-gpu
```
Abogen detects CUDA automatically. To use an NVIDIA GPU, install the matching PyTorch build before installing Abogen:
```bash
pip install torch torchvision torchaudio --index-url https://download.pytorch.org/whl/cu128
pip install abogen
```
On Linux with AMD GPUs, install PyTorch/ROCm nightly wheels:
```bash
pip install --pre torch torchvision torchaudio --index-url https://download.pytorch.org/whl/nightly/rocm6.4
```
Abogen falls back to CPU rendering if no GPU is available.
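The detect-then-fall-back behaviour follows the standard PyTorch idiom, roughly like this (a sketch of the general pattern, not Abogen's exact code):

```python
def pick_device():
    """Prefer CUDA when a usable GPU build of PyTorch is present, else use CPU."""
    try:
        import torch
        if torch.cuda.is_available():
            return "cuda"
    except ImportError:
        pass  # PyTorch not installed in this environment
    return "cpu"

print(pick_device())
```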
- Upload a document (drag & drop or use the upload button).
- Choose voice, language, speed, subtitle style, and output format.
- Click Create job. The job immediately appears in the queue.
- Watch progress and logs update live. Download audio/subtitle assets when complete.
- Cancel or delete jobs any time. Download logs for troubleshooting.
Multiple jobs can be queued at once; the background worker processes them sequentially, in submission order.
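That queue behaviour can be pictured as a plain FIFO drained by a single background thread (an illustrative sketch of the pattern, not Abogen's actual queue service):

```python
import queue
import threading

jobs = queue.Queue()   # FIFO: jobs come out in submission order
results = []

def worker():
    while True:
        job = jobs.get()
        if job is None:          # sentinel value shuts the worker down
            break
        results.append(f"rendered {job}")   # stand-in for the conversion step

t = threading.Thread(target=worker, daemon=True)
t.start()
for name in ("book1.epub", "notes.md"):
    jobs.put(name)
jobs.put(None)
t.join()
print(results)  # jobs finish strictly in the order they were queued
```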
Need machine-readable status updates? The dashboard calls a small set of helper endpoints you can reuse:
- `GET /api/jobs/<id>` returns job metadata, progress, and log lines in JSON.
- `GET /partials/jobs` renders the live job list as HTML (htmx uses this for polling).
- `GET /partials/jobs/<id>/logs` renders just the log window.
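For scripting, polling the JSON endpoint needs only the standard library. A minimal client sketch (the host/port and the response field names are assumptions based on the defaults and description above):

```python
import json
from urllib.request import urlopen

def job_url(job_id, base="http://localhost:8808"):
    """Build the JSON status URL for one job."""
    return f"{base}/api/jobs/{job_id}"

def job_status(job_id, base="http://localhost:8808"):
    """Fetch one job's metadata, progress, and logs as a dict."""
    with urlopen(job_url(job_id, base)) as resp:
        return json.load(resp)

# Example (requires a running server and a real job id):
# info = job_status("1234")
# print(info.get("status"), info.get("progress"))
```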
More automation hooks are planned; contributions are very welcome if you need additional routes.
Most behaviour is controlled through the UI, but a few environment variables are helpful for automation:
- `ABOGEN_SECRET_KEY` – provide your own random secret when deploying across multiple replicas.
- `ABOGEN_DEBUG` – set to `true` for verbose Flask error output.
- `ABOGEN_SETTINGS_DIR` – change where Abogen stores its JSON settings/configuration files.
- `ABOGEN_TEMP_DIR` – change where temporary uploads and cache files are stored.
- `ABOGEN_OUTPUT_DIR` – change where rendered audio/subtitles are written.
If unset, Abogen picks sensible defaults suitable for local usage.
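The override-with-default behaviour is the usual `os.environ.get` idiom (a sketch; the default values shown come from the table earlier in this README, and the helper name is illustrative):

```python
import os

def setting(name, default):
    """Read a setting from the environment, falling back to a default."""
    return os.environ.get(name, default)

port = int(setting("ABOGEN_PORT", "8808"))
debug = setting("ABOGEN_DEBUG", "false").lower() == "true"
```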
You can also create a `.env` file in the project root (see `.env.example`) to configure these paths when running locally. The application loads `.env` automatically on startup.
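If you are curious what that loading amounts to, a minimal `.env` reader looks roughly like this (an illustrative sketch only; Abogen's actual loader may differ, e.g. if it uses a library such as python-dotenv):

```python
import os

def load_dotenv(path=".env"):
    """Read KEY=VALUE lines into os.environ, skipping blanks and # comments.
    Existing environment variables are not overwritten."""
    if not os.path.exists(path):
        return
    with open(path) as f:
        for line in f:
            line = line.strip()
            if not line or line.startswith("#") or "=" not in line:
                continue
            key, _, value = line.partition("=")
            os.environ.setdefault(key.strip(), value.strip())
```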
```bash
git clone https://github.com/denizsafak/abogen.git
cd abogen
python -m venv .venv
source .venv/bin/activate
pip install -e .
pip install pytest
```
Run the server in development mode:
```bash
export ABOGEN_DEBUG=true
abogen
```
Static files live in `abogen/web/static`, templates in `abogen/web/templates`, and the conversion pipeline in `abogen/web/conversion_runner.py`.
```bash
python -m pytest
```
Unit tests cover the queue service, web routes, and conversion pipeline helpers. Contributions that add features should include new tests whenever practical.
The legacy PyQt5 interface is no longer packaged. Existing scripts that call `abogen.main` should switch to the new web entry point (`abogen.web.app:main`). The new experience works headlessly, plays nicely in Docker, and exposes JSON APIs for automation.
- Conversion jobs stay pending → ensure the background worker has write access to the upload/output directories.
- GPU not detected → verify the correct PyTorch wheel is installed (`pip show torch`) and that the drivers match the container/host.
- Subtitle files missing → check the job configuration; subtitles are optional and can be disabled per job.
- Logs are empty → run with `ABOGEN_DEBUG=true` to get verbose Flask error output in the server console.
If you hit a bug, open an issue describing the input file and the exact log output.
Pull requests are welcome! Please:
- Keep changes focused and well-tested
- Run `python -m pytest`
- Update documentation when behaviour changes
Thanks for helping make Abogen a great open-source audiobook generator.