Models

Cartesia provides a family of state-of-the-art models, including our highly accurate, low-latency Sonic TTS model family.

  • the latest stable snapshot of the model

To use the stable version of the model, we recommend using the base model name (e.g. sonic-2). In many cases the stable and preview snapshots are the same, but in some cases the preview snapshot may have additional features or improvements.

sonic-2

Sonic-2 is our most capable text-to-speech model. It provides ultra-realistic speech with accurate transcript following, minimal hallucinations, and best-in-class voice cloning. It is latency-optimized and achieves 90 ms model latency.

Additional Capabilities:

  • Higher fidelity voice cloning
  • Timestamps for all 15 languages
  • Infill support

Snapshot | Release Date | Languages | Status
sonic-2-2025-04-16 | April 16, 2025 | en, fr, de, es, pt, zh, ja, hi, it, ko, nl, pl, ru, sv, tr | Stable
sonic-2-2025-03-07 | March 7, 2025 | en, fr, de, es, pt, zh, ja, hi, it, ko, nl, pl, ru, sv, tr | Stable

Note: Snapshots released after sonic-2-2025-03-07 do not support _experimental_controls. If the latest model does not meet your needs and you would like to use the controls, make requests to sonic-2-2025-03-07.
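The note above can be sketched as a small helper that picks the model ID depending on whether a request needs the controls. This is only an illustration: the function and constant names are hypothetical and not part of any Cartesia SDK, and the model IDs are the ones listed on this page.

```javascript
// Snapshot names taken from the sonic-2 table on this page.
const LAST_SNAPSHOT_WITH_CONTROLS = "sonic-2-2025-03-07";
const SONIC_2_BASE = "sonic-2";

// Hypothetical helper (not an SDK function): returns the model ID to send.
// Requests that rely on _experimental_controls are pinned to the last
// snapshot that supports them; everything else uses the base model name.
function modelForControls(needsExperimentalControls) {
  return needsExperimentalControls ? LAST_SNAPSHOT_WITH_CONTROLS : SONIC_2_BASE;
}

console.log(modelForControls(true));  // sonic-2-2025-03-07
console.log(modelForControls(false)); // sonic-2
```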

To learn how to use the Sonic TTS family, see Make an API request.

sonic-turbo

All the power of Sonic, with half the latency (as low as 40ms).

Snapshot | Release Date | Languages | Status
sonic-turbo-2025-03-07 | March 7, 2025 | en, fr, de, es, pt, zh, ja, hi, it, ko, nl, pl, ru, sv, tr | Stable

sonic

The first version of our flagship text-to-speech model. It produces high-accuracy, expressive speech, and is optimized for efficiency to achieve low latency.

Snapshot | Release Date | Languages | Status
sonic-2024-12-12 | December 12, 2024 | en, fr, de, es, pt, zh, ja, hi, it, ko, nl, pl, ru, sv, tr | Stable
sonic-2024-10-19 | October 19, 2024 | en, fr, de, es, pt, zh, ja, hi, it, ko, nl, pl, ru, sv, tr |

Selecting a Model

When making API calls, you can specify either:

// Use the base model (automatically routes to the latest snapshot)
const baseModel = "sonic-2";

// Or specify a particular snapshot for consistency
const pinnedModel = "sonic-2-2025-03-07";
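To tell the two forms apart in code, one option is to split a model ID into its base name and optional date suffix. The sketch below assumes that every snapshot name ends in -YYYY-MM-DD, as the snapshot names on this page do; `parseModelId` is a hypothetical helper, not an API provided by Cartesia.

```javascript
// Matches a trailing -YYYY-MM-DD suffix (an assumption based on the
// snapshot names listed on this page).
const SNAPSHOT_SUFFIX = /^(.+)-(\d{4}-\d{2}-\d{2})$/;

// Hypothetical helper: splits a model ID into base name and snapshot date.
// Base model names have no date suffix, so snapshotDate is null for them.
function parseModelId(modelId) {
  const match = SNAPSHOT_SUFFIX.exec(modelId);
  if (match === null) {
    return { base: modelId, snapshotDate: null };
  }
  return { base: match[1], snapshotDate: match[2] };
}

console.log(parseModelId("sonic-2"));
// { base: 'sonic-2', snapshotDate: null }
console.log(parseModelId("sonic-2-2025-03-07"));
// { base: 'sonic-2', snapshotDate: '2025-03-07' }
```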

Continuous updates

All models have a base model name (e.g. sonic-2, sonic-turbo, sonic). We recommend using these for prototyping and development, then switching to a date-versioned model for production use cases to ensure stability.
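One way to follow this recommendation is to keep the model ID in a per-environment lookup, so development tracks the base name while production stays pinned. This is a minimal sketch under assumptions: the environment names and the `modelForEnvironment` helper are hypothetical, and the pinned snapshot is simply the newest sonic-2 snapshot listed on this page.

```javascript
// Hypothetical per-environment configuration.
// Development follows the base model name; production is pinned to a
// date-versioned snapshot so its behavior does not change unexpectedly.
const MODEL_BY_ENVIRONMENT = new Map([
  ["development", "sonic-2"],
  ["production", "sonic-2-2025-04-16"],
]);

// Hypothetical helper: looks up the model ID for an environment name and
// fails loudly on an unknown environment instead of guessing a default.
function modelForEnvironment(environment) {
  const model = MODEL_BY_ENVIRONMENT.get(environment);
  if (model === undefined) {
    throw new Error(`No model configured for environment: ${environment}`);
  }
  return model;
}

console.log(modelForEnvironment("development")); // sonic-2
console.log(modelForEnvironment("production"));  // sonic-2-2025-04-16
```

Upgrading production to a newer snapshot then becomes a one-line configuration change that can be reviewed and tested on its own.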

Language Support

  1. English (en)
  2. French (fr)
  3. German (de)
  4. Spanish (es)
  5. Portuguese (pt)
  6. Chinese (zh)
  7. Japanese (ja)
  8. Hindi (hi)
  9. Italian (it)
  10. Korean (ko)
  11. Dutch (nl)
  12. Polish (pl)
  13. Russian (ru)
  14. Swedish (sv)
  15. Turkish (tr)
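
A client might check a language code against this list before sending a request. The sketch below hard-codes the 15 codes from the list above; `isSupportedLanguage` is a hypothetical helper, and the set would need updating when new languages are added.

```javascript
// The 15 language codes listed above.
const SUPPORTED_LANGUAGES = new Set([
  "en", "fr", "de", "es", "pt", "zh", "ja", "hi",
  "it", "ko", "nl", "pl", "ru", "sv", "tr",
]);

// Hypothetical helper: true if the code is in the list above.
// Codes are compared case-insensitively, so "EN" is treated like "en".
function isSupportedLanguage(code) {
  return SUPPORTED_LANGUAGES.has(String(code).toLowerCase());
}

console.log(isSupportedLanguage("ja")); // true
console.log(isSupportedLanguage("ar")); // false
```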

Future Updates

New snapshots are released periodically with improvements in performance, additional language support, and new capabilities. Check back regularly for updates.