Available Models
Explore the models available on Cartesia
The model ID upbeat-moon is deprecated and will be removed in our next model release. We recommend switching to sonic-english to stay up to date with our latest English model.
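As a minimal sketch of the migration, assuming your application stores model IDs as plain strings, a small lookup can rewrite the deprecated ID before a request is built. The names DEPRECATED_MODEL_IDS and resolve_model_id are hypothetical and are not part of any Cartesia SDK.

```python
# Hypothetical helper for illustration; not part of any Cartesia SDK.
# Maps each deprecated model ID to its recommended replacement.
DEPRECATED_MODEL_IDS = {"upbeat-moon": "sonic-english"}


def resolve_model_id(model_id: str) -> str:
    """Return the recommended replacement for a deprecated ID, else the ID unchanged."""
    return DEPRECATED_MODEL_IDS.get(model_id, model_id)
```

Routing every stored ID through one function keeps the change in a single place if further IDs are deprecated later.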
Sonic English
Sonic English is our latest English text-to-speech model. It demonstrates strong overall capabilities and is optimized for efficiency to achieve low latency.
Model ID: sonic-english
(Aliases to our latest English model)
Release date: May 2024
Last updated: Aug 2024
Supported languages: English
Capabilities:
- Supports abbreviations, acronyms, initialisms and phonemes (alpha).
- Supports numbers, dates, phone numbers and SSNs.
Known issues:
- Audio generations can loop or diverge on transcripts that have repeated words in succession.
- Audio generations may occasionally sound fast.
- Some long numbers and phone numbers may sound rushed as well.
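Because generations can loop or diverge on words repeated in succession, it may help to flag such transcripts before sending them. The sketch below is a client-side heuristic only; find_repeated_words is a hypothetical name, and what to do with a flagged transcript (rephrase, split, or send anyway) is left to the application.

```python
import re


def find_repeated_words(transcript: str) -> list[str]:
    """Return words that occur at least twice in immediate succession.

    Matching is case-insensitive and ignores punctuation between words.
    Each word is reported once, in order of first occurrence.
    """
    words = re.findall(r"[\w']+", transcript.lower())
    repeated: list[str] = []
    # Compare each word with its immediate successor.
    for prev, cur in zip(words, words[1:]):
        if prev == cur and cur not in repeated:
            repeated.append(cur)
    return repeated
```

For example, find_repeated_words("No, no, no. I said yes yes.") flags "no" and "yes", while a transcript without back-to-back repeats yields an empty list.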
Sonic Multilingual [Alpha]
Sonic Multilingual [Alpha] is the first version of our multilingual text-to-speech model, demonstrating great transcript following and low latency.
Model ID: sonic-multilingual
(Aliases to our latest multilingual model)
Release date: Jun 2024
Last updated: Aug 2024
Supported languages: English (en), French (fr), German (de), Spanish (es), Portuguese (pt), Chinese (zh), and Japanese (ja)
Capabilities:
- Supports numbers, dates, and phone numbers in English, French, German, Spanish, and Chinese.
Known issues:
- Some inaccuracies in numbers, dates, and phone numbers in Japanese and Portuguese.
- Audio generations may occasionally sound fast.
Recommendations:
- For the best performance, use a voice in the same language as the transcript.
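The model IDs and language codes listed above can be combined into a small selection helper. This is a sketch under stated assumptions: pick_model_id and voice_matches_transcript are hypothetical names, and preferring sonic-english for English transcripts is a design choice made here, not a documented rule.

```python
# Language codes listed for sonic-multilingual on this page.
SONIC_MULTILINGUAL_LANGUAGES = {"en", "fr", "de", "es", "pt", "zh", "ja"}


def pick_model_id(language: str) -> str:
    """Choose a model ID for a transcript language code such as "en" or "fr".

    Assumption: English goes to sonic-english; the other listed
    languages go to sonic-multilingual.
    """
    code = language.lower()
    if code == "en":
        return "sonic-english"
    if code in SONIC_MULTILINGUAL_LANGUAGES:
        return "sonic-multilingual"
    raise ValueError(f"Unsupported language code: {language!r}")


def voice_matches_transcript(voice_language: str, transcript_language: str) -> bool:
    """Return True when the voice and transcript share a language code."""
    return voice_language.lower() == transcript_language.lower()
```

Calling voice_matches_transcript before a request gives the application a chance to warn when a voice from a different language is paired with the transcript.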