Castwright

Engines & models

Draft — verify against the current release.

Castwright supports three synthesis engines. You do not need all three: Kokoro alone is enough to produce a full-cast audiobook.

Kokoro (default)

Kokoro is the default engine and the one Castwright uses for generation unless you choose otherwise. It is always resident in memory once the weights are installed, so there is no wait for a model to load between chapters.

  • Languages: English only (28 voices).
  • Voice names: prefixed af_, am_, bf_, bm_ (American/British female/male).
  • VRAM: roughly 1 GB.
  • When to use it: for every character when you want fast, reliable English synthesis.
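
The prefix convention above can be decoded mechanically. The following is a minimal sketch, not part of Castwright itself: the `KokoroVoice` class and the `parse_kokoro_voice` function are hypothetical names introduced here, and the only thing taken from this page is the meaning of the four prefixes.

```python
from dataclasses import dataclass

# Prefix letters as described above: first letter = accent, second = gender.
ACCENTS = {"a": "American", "b": "British"}
GENDERS = {"f": "female", "m": "male"}


@dataclass(frozen=True)
class KokoroVoice:
    """Hypothetical record for a decoded voice name (illustration only)."""
    name: str
    accent: str
    gender: str


def parse_kokoro_voice(name: str) -> KokoroVoice:
    """Split a name such as 'bf_example' into its accent and gender."""
    prefix, sep, rest = name.partition("_")
    if not sep or not rest or len(prefix) != 2:
        raise ValueError(f"not a Kokoro-style voice name: {name!r}")
    accent_code, gender_code = prefix
    if accent_code not in ACCENTS or gender_code not in GENDERS:
        raise ValueError(f"unknown voice prefix: {prefix!r}")
    return KokoroVoice(name, ACCENTS[accent_code], GENDERS[gender_code])
```

A helper like this could be used to group the voice list by accent or gender when assigning presets to a cast. The voice name `bf_example` is a placeholder, not a real voice.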

Qwen (voice design)

The Qwen engine is used for designing custom voices. When you open the “Design voice” panel for a character, Castwright loads the Qwen VoiceDesign model, generates a voice from your description, and lets you compare the result with the default. You can accept the design, try another description, or discard it.

Qwen is also the synthesis engine for any character whose voice was designed with it. It is loaded on demand, not at startup.

  • Languages: multilingual.
  • VRAM: the design model is about 4–5 GB; it unloads automatically once you leave the cast-review screen.
  • When to use it: when you want a character voice that is precisely tuned — a specific accent, age, affect, or timbre that no preset covers.

Coqui XTTS v2 (optional)

Coqui XTTS is an optional engine for voice cloning. You load it explicitly via the Model Manager; it is not installed by default. Once loaded, it takes over from Kokoro for any character assigned a cloned voice.

  • Languages: multilingual.
  • VRAM: similar to Qwen (roughly 4–5 GB); loading Coqui unloads the Ollama analysis model to reclaim memory.
  • When to use it: when you want to narrate in your own voice or clone a specific voice from a short audio sample. (Voice cloning is still in development and is planned for an upcoming release.)
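
Taken together, the three engine descriptions amount to a per-character routing rule. The sketch below illustrates that rule under stated assumptions. `VoiceKind`, `ENGINE_FOR_KIND`, and `pick_engine` are hypothetical names and do not come from Castwright's code. The fallback to Kokoro when Coqui is not loaded is an interpretation of "takes over from Kokoro", not documented behavior.

```python
from enum import Enum


class VoiceKind(Enum):
    """Hypothetical classification of how a character's voice was created."""
    PRESET = "preset"      # a built-in Kokoro voice
    DESIGNED = "designed"  # created in the "Design voice" panel
    CLONED = "cloned"      # cloned from a short audio sample


# Engine labels are illustrative strings, not real Castwright identifiers.
ENGINE_FOR_KIND = {
    VoiceKind.PRESET: "kokoro",
    VoiceKind.DESIGNED: "qwen",
    VoiceKind.CLONED: "coqui",
}


def pick_engine(kind: VoiceKind, coqui_loaded: bool = False) -> str:
    """Return the engine label that would synthesize a voice of this kind."""
    # Assumption: until Coqui is loaded via the Model Manager,
    # characters with cloned voices stay on the default engine.
    if kind is VoiceKind.CLONED and not coqui_loaded:
        return "kokoro"
    return ENGINE_FOR_KIND[kind]
```

For example, `pick_engine(VoiceKind.DESIGNED)` returns `"qwen"`, while `pick_engine(VoiceKind.CLONED)` returns `"kokoro"` until `coqui_loaded=True` is passed.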

The Model Manager

The Model Manager lives under the account menu (or at #/models). It lists all installed and available engines, lets you install or remove voice weights, and displays current GPU memory usage. If an engine fails to load, the error appears here first.
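
Before loading an extra engine, the rough VRAM figures quoted on this page can be added up for a back-of-the-envelope check. This is a sketch only. The function names and the 1 GB headroom default are assumptions. The figures use the upper end of the 4–5 GB range for Qwen and treat Coqui as equal to Qwen. The Ollama analysis model is left out because this page does not give its size.

```python
# Approximate figures from this page, in GB (upper bounds where a range is given).
APPROX_VRAM_GB = {
    "kokoro": 1.0,  # "roughly 1 GB"
    "qwen": 5.0,    # "about 4-5 GB"
    "coqui": 5.0,   # "similar to Qwen" (assumed equal)
}


def estimate_vram_gb(loaded_engines) -> float:
    """Sum the approximate VRAM of each distinct loaded engine."""
    return sum(APPROX_VRAM_GB[name] for name in set(loaded_engines))


def fits(loaded_engines, gpu_total_gb: float, headroom_gb: float = 1.0) -> bool:
    """True if the estimate plus a safety margin fits in the GPU's memory."""
    return estimate_vram_gb(loaded_engines) + headroom_gb <= gpu_total_gb
```

With these numbers, Kokoro plus Qwen comes to about 6 GB, so `fits(["kokoro", "qwen"], 8.0)` is true and `fits(["kokoro", "qwen"], 6.5)` is false. The memory readout in the Model Manager remains the authoritative figure.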