Models
Cartesia provides a family of state-of-the-art models, including our highly accurate, low-latency Sonic model for text-to-speech.
Overview
Model ID | Description | Available languages |
---|---|---|
sonic | Stable alias, currently points to sonic-2024-12-12. | en, fr, de, es, pt, zh, ja, hi, it, ko, nl, pl, ru, sv, tr |
sonic-preview | Preview model. | en |
sonic-2024-12-12 | December update. | en, fr, de, es, pt, zh, ja, hi, it, ko, nl, pl, ru, sv, tr |
sonic-2024-10-19 | October update. | en, fr, de, es, pt, zh, ja, hi, it, ko, nl, pl, ru, sv, tr |
sonic-english | [Deprecated] Use sonic for the most up-to-date English model, or sonic-[date] for a specific model release. sonic-english will always point to sonic-2024-10-19. | en |
sonic-multilingual | [Deprecated] Use sonic for the most up-to-date multilingual model, or sonic-[date] for a specific model release. sonic-multilingual will always point to sonic-2024-10-19. | en, fr, de, es, pt, zh, ja, hi, it, ko, nl, pl, ru, sv, tr |
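The language coverage listed on this page can be captured in a small lookup so that a request is rejected before it is sent with an unsupported language. The following is a minimal sketch: the model IDs and language codes are transcribed from this page, while `SUPPORTED_LANGUAGES` and `supports_language` are hypothetical names used for illustration and are not part of any Cartesia SDK.

```python
# Language codes transcribed from this page; not fetched from the API.
MULTILINGUAL = frozenset(
    {"en", "fr", "de", "es", "pt", "zh", "ja", "hi",
     "it", "ko", "nl", "pl", "ru", "sv", "tr"}
)
ENGLISH_ONLY = frozenset({"en"})

# Hypothetical lookup table: model ID -> supported language codes.
SUPPORTED_LANGUAGES = {
    "sonic": MULTILINGUAL,
    "sonic-preview": ENGLISH_ONLY,
    "sonic-2024-12-12": MULTILINGUAL,
    "sonic-2024-10-19": MULTILINGUAL,
    "sonic-english": ENGLISH_ONLY,        # deprecated
    "sonic-multilingual": MULTILINGUAL,   # deprecated
}


def supports_language(model_id: str, language: str) -> bool:
    """Return True if the given model ID is listed as supporting the language code."""
    if model_id not in SUPPORTED_LANGUAGES:
        raise ValueError(f"unknown model ID: {model_id!r}")
    return language in SUPPORTED_LANGUAGES[model_id]
```

Because sonic is an alias, its entry would need updating whenever the alias moves to a release with different language coverage.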
Continuous updates
sonic and sonic-preview are aliases for the latest stable and preview models. We recommend using these for prototyping and development, then switching to a date-versioned model for production use cases to ensure stability.
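One way to apply this recommendation is to choose the model ID from the deployment environment, so that production always uses a pinned release. This is a minimal sketch under assumptions: the environment names and the `choose_model_id` helper are hypothetical, and the pinned value is simply the latest dated release named on this page.

```python
STABLE_ALIAS = "sonic"                # moves as new stable releases ship
PREVIEW_ALIAS = "sonic-preview"       # moves as new preview snapshots ship
PINNED_VERSION = "sonic-2024-12-12"   # date-versioned release, does not move


def choose_model_id(environment: str, use_preview: bool = False) -> str:
    """Return a pinned model ID in production and a moving alias elsewhere."""
    if environment == "production":
        # Pinning keeps output stable even when the aliases are repointed.
        return PINNED_VERSION
    return PREVIEW_ALIAS if use_preview else STABLE_ALIAS
```

With this approach, upgrading production becomes an explicit change to `PINNED_VERSION` rather than a side effect of an alias update.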
Sonic
Sonic is our flagship text-to-speech model. It produces high-accuracy, expressive speech, and is optimized for efficiency to achieve low latency.
- Model alias: sonic
- Release date: Dec 2024
- Latest version: sonic-2024-12-12
To learn how to use Sonic, see Make an API request.
Sonic Preview
Sonic Preview is our latest preview snapshot. It is updated regularly, and may not be as stable as sonic. Use sonic-preview to see the latest features and research direction from Cartesia.
The latest version of sonic-preview demonstrates superior transcript following, especially for dates, long numbers, and phone numbers.
- Model alias: sonic-preview
- Supported languages: en
Legacy Models
Sonic English (deprecated)
Sonic English is our original English text-to-speech model. While it is still supported in our API, we recommend using the sonic model instead.
- Model ID: sonic-english
- Release date: May 2024
- Latest version: Oct 2024
- Supported languages: English
Capabilities:
- Supports abbreviations, acronyms, initialisms and phonemes (alpha).
- Supports numbers, dates, phone numbers and SSNs.
Known issues:
- Audio generations can loop or diverge on transcripts that have repeated words in succession.
- Audio generations may occasionally sound fast.
- Some long numbers and phone numbers may sound rushed as well.
Sonic Multilingual (deprecated)
Sonic Multilingual was the first multilingual Sonic variant, with improved transcript following and low latency. While it is still supported in our API, we recommend using the sonic model instead.
- Model ID: sonic-multilingual
- Release date: Jun 2024
- Latest version: Sept 2024
Supported languages:
- English (en)
- French (fr)
- German (de)
- Spanish (es)
- Portuguese (pt)
- Chinese (zh)
- Japanese (ja)
- Hindi (hi)
- Italian (it)
- Korean (ko)
- Dutch (nl)
- Polish (pl)
- Russian (ru)
- Swedish (sv)
- Turkish (tr)
Capabilities:
- Supports numbers, dates, and phone numbers in English, French, German, Spanish, and Chinese.
Known issues:
- Some inaccuracies in numbers, dates, and phone numbers in Japanese and Portuguese.
- Audio generations may occasionally sound fast.
Recommendations:
- For the best performance, use a voice from the same language as the transcript.
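The recommendation above can be enforced in client code by filtering candidate voices on language before a request is made. This is a sketch under assumptions: the `Voice` record and its `language` field are hypothetical stand-ins for whatever voice metadata your application keeps, not a description of the Cartesia API.

```python
from dataclasses import dataclass
from typing import Iterable, Optional


@dataclass(frozen=True)
class Voice:
    """Hypothetical voice record; field names are illustrative only."""
    id: str
    language: str  # language code such as "en" or "fr"


def pick_voice(voices: Iterable[Voice], transcript_language: str) -> Optional[Voice]:
    """Return the first voice whose language matches the transcript, or None."""
    for voice in voices:
        if voice.language == transcript_language:
            return voice
    return None
```

Returning `None` rather than falling back to a voice in another language leaves the caller to decide whether a mismatched voice is acceptable.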