Changelog | Cartesia

Updates to existing APIs for sonic-2-2025-04-16

Starting with sonic-2-2025-04-16, we’re removing support for:

Embeddings
stability cloning mode
Experimental controls for adjusting speed and emotion.

In our latest models, the similarity cloning mode is dramatically better than stability cloning, and we’ve solved the stability issues. Since it’s no longer worth using stability mode at all, we’ve removed it.

The experimental controls for speed and emotion negatively affect model stability in practice. We’re working on adding these controls in future model versions in a way that is more stable and effective.

To control speed and emotion in Sonic 2.0, we recommend using Instant Voice Cloning to create new Voices with the modifications you’re looking for. For example, you can speed up or slow down voices with FFMPEG, use Voice Changer to create emotive versions of voices, and make instant clones of voices generated with embeddings on sonic-2-2025-03-07.

Users who absolutely need embeddings or experimental controls should use API version 2024-11-13 with model ID sonic-2-2025-03-07, both of which are still available. Please be aware that these features could lead to reduced model stability even with that model.

Specific API changes:

sonic-2 and sonic-2-2025-04-16 will ignore experimental controls used with TTS generations
Voice cloning only supports similarity clones — this technique performs the best across-the-board on the sonic-2 model.
Removed embeddings from all endpoints.
Voices may only be specified by Voice ID
- /tts generations cannot be called with voice embeddings
Deprecated /voices/create and /voices/mix