Available Models

Explore the models available on Cartesia

The model ID upbeat-moon is deprecated and will be removed in our next model release. We recommend switching to sonic-english to stay up to date with our latest English model.
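If the deprecated ID is stored in configuration or scattered through a codebase, a small lookup can ease the migration. The sketch below is illustrative only: `DEPRECATED_MODEL_IDS` and `resolve_model_id` are hypothetical names, not part of any Cartesia SDK, and the mapping contains only the replacement stated above.

```python
# Hypothetical helper, not part of any Cartesia SDK.
# Maps deprecated model IDs to the replacement recommended on this page.
DEPRECATED_MODEL_IDS = {
    "upbeat-moon": "sonic-english",
}


def resolve_model_id(model_id: str) -> str:
    """Return the recommended replacement for a deprecated model ID.

    IDs that are not known to be deprecated are returned unchanged.
    """
    return DEPRECATED_MODEL_IDS.get(model_id, model_id)


if __name__ == "__main__":
    print(resolve_model_id("upbeat-moon"))
    print(resolve_model_id("sonic-multilingual"))
```

Resolving the ID in one place means any future deprecation only needs a new entry in the mapping.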

Sonic English

Sonic English is our latest English text-to-speech model. It demonstrates strong overall capabilities and is optimized for efficiency to achieve low latency.

Model ID: sonic-english (Aliases to our latest English model)

Release date: May 2024

Last updated: Aug 2024

Supported languages: English

Capabilities:

  • Supports abbreviations, acronyms, initialisms and phonemes (alpha).
  • Supports numbers, dates, phone numbers and SSNs.

Known issues:

  • Audio generations can loop or diverge on transcripts that have repeated words in succession.
  • Audio generations may occasionally sound fast.
  • Some long numbers and phone numbers may sound rushed as well.
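Because repeated words in succession can cause looping or divergence, it may help to flag such transcripts before sending them for generation. The following is a minimal sketch under that assumption; `find_repeated_words` is a hypothetical helper, and its simple word-splitting rule (letters, digits, and apostrophes) is a choice made for illustration, not how the model tokenizes text.

```python
import re


def find_repeated_words(transcript: str) -> list[str]:
    """Return the distinct words that appear back-to-back in the transcript.

    Comparison is case-insensitive and ignores punctuation between words.
    Words are listed in order of their first back-to-back occurrence.
    """
    # Assumption: a "word" is a run of letters, digits, underscores, or apostrophes.
    words = re.findall(r"[\w']+", transcript.lower())
    repeated: list[str] = []
    for previous, current in zip(words, words[1:]):
        if previous == current and current not in repeated:
            repeated.append(current)
    return repeated


if __name__ == "__main__":
    sample = "No no, that is very very good"
    print(find_repeated_words(sample))
```

A non-empty result is only a hint that the transcript might trigger the issue; whether to rephrase it or send it unchanged is up to the caller.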

Sonic Multilingual [Alpha]

Sonic Multilingual [Alpha] is the first version of our multilingual text-to-speech model. It follows transcripts closely and generates audio with low latency.

Model ID: sonic-multilingual (Aliases to our latest multilingual model)

Release date: Jun 2024

Last updated: Aug 2024

Supported languages: English (en), French (fr), German (de), Spanish (es), Portuguese (pt), Chinese (zh), and Japanese (ja)

Capabilities:

  • Supports numbers, dates, and phone numbers in English, French, German, Spanish, and Chinese.

Known issues:

  • Numbers, dates, and phone numbers may be rendered inaccurately in Japanese and Portuguese.
  • Audio generations may occasionally sound fast.
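The language list and the number-handling caveat above can be encoded as a pre-flight check. This is a sketch, not an official validator: `SUPPORTED_LANGUAGES`, `NUMBER_CAVEAT_LANGUAGES`, and `check_transcript` are hypothetical names, and the check looks only for digit characters, so numbers written out as words or in kanji are not detected.

```python
# Language codes listed for sonic-multilingual on this page.
SUPPORTED_LANGUAGES = {"en", "fr", "de", "es", "pt", "zh", "ja"}

# Languages with known inaccuracies in numbers, dates, and phone numbers.
NUMBER_CAVEAT_LANGUAGES = {"ja", "pt"}


def check_transcript(language: str, transcript: str) -> list[str]:
    """Validate a language code and return warnings for the transcript.

    Raises ValueError for a language code not listed on this page.
    """
    if language not in SUPPORTED_LANGUAGES:
        raise ValueError(f"Unsupported language code: {language!r}")

    warnings: list[str] = []
    # Assumption: the presence of any digit is a good enough signal
    # that the transcript contains a number, date, or phone number.
    has_digits = any(ch.isdigit() for ch in transcript)
    if language in NUMBER_CAVEAT_LANGUAGES and has_digits:
        warnings.append(
            f"Numbers, dates, and phone numbers may be inaccurate in {language!r}."
        )
    return warnings


if __name__ == "__main__":
    print(check_transcript("pt", "Ligue para 555 1234"))
    print(check_transcript("fr", "Il y a 3 chats"))
```

Returning warnings rather than raising leaves it to the caller to decide whether to spell numbers out in words or to proceed as-is.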

Recommendations:

  • For best performance, use a voice in the same language as the transcript.
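One way to follow this recommendation is to filter candidate voices by language before choosing one. The sketch below assumes each voice carries a language code; the `Voice` dataclass, its fields, and the sample IDs are hypothetical stand-ins for illustration and do not reflect the shape of any real API response.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Voice:
    """Hypothetical voice record: an identifier plus a language code."""

    id: str
    language: str


def voices_for_language(voices: list[Voice], language: str) -> list[Voice]:
    """Return only the voices whose language matches the transcript language."""
    return [voice for voice in voices if voice.language == language]


if __name__ == "__main__":
    # Placeholder voices; the IDs are made up for this example.
    catalog = [
        Voice(id="voice-a", language="en"),
        Voice(id="voice-b", language="fr"),
        Voice(id="voice-c", language="fr"),
    ]
    matches = voices_for_language(catalog, "fr")
    print([voice.id for voice in matches])
```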