Where do these voices come from?
Voices created by these endpoints rely on our voice embedding models:
- POST /voices
- POST /voices/mix
- POST /voices/clone/clip
Creating voices
You can move to our Clone Voice API or use our web UI to create voices from 3–10 seconds of source audio. You can test these API changes by setting your Cartesia Version to 2026-03-01. We recommend upgrading the Cartesia Version on your production traffic before June 1 to make sure nothing breaks.
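If you call the HTTP API directly rather than through an SDK, pinning the version might look like the minimal sketch below. It assumes the version is sent in a `Cartesia-Version` request header and the API key as a bearer token; the helper name `build_headers` is hypothetical, so check the API reference for the exact header names your account uses.

```python
# Minimal sketch: build request headers that pin the Cartesia Version.
# Assumptions (not confirmed by this page): the version travels in a
# "Cartesia-Version" header and the API key is sent as a bearer token.

TARGET_VERSION = "2026-03-01"  # version named in the text above


def build_headers(api_key: str, version: str = TARGET_VERSION) -> dict:
    """Return headers for a request pinned to the given Cartesia Version."""
    if not api_key:
        raise ValueError("api_key must be a non-empty string")
    return {
        "Cartesia-Version": version,
        "Authorization": f"Bearer {api_key}",
    }


if __name__ == "__main__":
    # Placeholder key for illustration only.
    print(build_headers("YOUR_API_KEY"))
```

Keeping the version in one constant lets you try 2026-03-01 on test traffic first and then change a single value when you upgrade production.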
Here is an example using the Cartesia SDK: