About
Many voices, one machine.
Fiction lives in its voices. A novel populated by twenty characters, each with a distinct cadence, register, and age, collapses to a single performance when a narrator reads it aloud. That's not a knock on narrators; it's a structural limit of the single-voice format. When Hermione speaks exactly like Dumbledore, something is quietly lost.
The cast audiobook, with a different voice for every character, solves this. But it costs tens of thousands of dollars in studio time and talent. So it happens for roughly the top 1% of books, once, and never again. Every other book, every backlist reissue, every beloved series that didn't quite break through gets the flat narrator and the lost texture.
Castwright is a bet: that a full cast, on your own hardware, is the better outcome — better than a metered cloud service that charges per character and never forgets what you generated, better than nothing. The voices run locally. Your books stay on your machine. The series memory persists between books. Any book, performed by a full cast — kept true, kept yours, book after book.
We're honest about the limitations. You'll want a gaming PC or laptop with an 8 GB NVIDIA GPU; that's the benchmarked path. Apple silicon works, but slower. Castwright is source-available, not OSI open source: the code goes public with the beta, competing redistribution is restricted for two years, then it converts to Apache-2.0. Voice cloning, the "even in your own voice" promise, is not yet in your hands; it comes in the first release after launch.
What's next: the companion listener app (iOS and Android) ships at launch. Voice cloning — your own voice, on your machine, with your consent — comes in the first release after that. Then eight more languages (the Qwen3-TTS engine already speaks ten), the app itself in your language — Russian first — and MCP support, so your AI assistant can drive the whole pipeline. We're building toward the version where every book can be performed.
Castwright stands on publicly released speech engines, each credited by name and license:
- Kokoro — fast, high-quality English voices (Apache-2.0)
- Coqui XTTS — voice cloning from a short sample (CPML; download-on-demand, user-supplied)
- Qwen — voice design: build a character's voice from a description (Apache-2.0; verify per release)