What is Voicebox?
Voicebox is a local-first, open-source voice synthesis studio built as a free alternative to ElevenLabs. It runs entirely on your machine — your voice data, your models, your privacy. Whether you need to clone a voice from a few seconds of audio, generate speech across 23 languages, or compose multi-voice narratives for podcasts and audiobooks, Voicebox handles it all without sending a single audio sample to the cloud.
Who it's for
Voicebox is designed for creators, developers, and privacy-conscious teams who need professional-grade voice synthesis without subscription fees or cloud dependency. Podcasters, game developers, accessibility tool builders, and indie content creators will find it particularly useful. Developers can integrate it into their own projects via the built-in REST API.
Key capabilities
- 7 TTS engines: Qwen3-TTS, Qwen CustomVoice, LuxTTS, Chatterbox Multilingual, Chatterbox Turbo, HumeAI TADA, and Kokoro
- Zero-shot voice cloning from a short reference audio sample
- 50+ preset voices via Kokoro, plus 9 Qwen CustomVoice presets
- 23 languages including English, Japanese, Hindi, Arabic, and Swahili
- Expressive speech with paralinguistic tags like [laugh], [sigh], and [gasp]
- Post-processing effects: pitch shift, reverb, delay, chorus, compression, and filters
- Stories editor: a multi-track timeline for conversations and podcasts
- Native performance: built with Tauri (Rust), not Electron — fast and lightweight
- Runs on macOS (MLX/Metal), Windows (CUDA), Linux, AMD ROCm, Intel Arc, and Docker
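As a rough illustration of the paralinguistic tags mentioned in the list above, a script could be checked for unrecognized tags before it is sent for synthesis. This is a minimal sketch, not part of Voicebox itself: the helper names `find_tags` and `unknown_tags` are hypothetical, and `KNOWN_TAGS` assumes only the three tags named above, although the engines may well accept more.

```python
import re

# Assumption: only the three tags named in the capability list.
# The real set supported by each engine may be larger.
KNOWN_TAGS = {"laugh", "sigh", "gasp"}

# Matches a lowercase word wrapped in square brackets, e.g. "[laugh]".
TAG_PATTERN = re.compile(r"\[([a-z]+)\]")


def find_tags(script: str) -> list[str]:
    """Return every bracketed tag name in order of appearance."""
    return TAG_PATTERN.findall(script)


def unknown_tags(script: str) -> list[str]:
    """Return tag names that are not in KNOWN_TAGS, in order of appearance."""
    return [tag for tag in find_tags(script) if tag not in KNOWN_TAGS]


if __name__ == "__main__":
    script = "Well [sigh] I did not expect that. [laugh] Did you? [giggle]"
    print(find_tags(script))     # ['sigh', 'laugh', 'giggle']
    print(unknown_tags(script))  # ['giggle']
```

A check like this catches typos such as `[laguh]` early, which is cheaper than discovering them after a long generation run.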
Why choose it over ElevenLabs?
ElevenLabs and similar platforms (Murf.ai, Play.ht, Speechify) charge per character generated and keep your voice data on their servers. Voicebox removes both concerns. There are no usage limits, so you can generate as much audio as your GPU can handle. Your voice clones are stored locally, which makes Voicebox well suited to teams handling sensitive or proprietary audio content. With 21,000+ GitHub stars and active development, it is already one of the most capable open-source voice tools available.

