Migrating to the Stable API
Our stable API is now available. Our previous API (version 0) will be deprecated on June 17. All future deprecations will come with at least one month's notice.
If you run into migration issues, please message us on Discord (or in your shared Slack channel, if you’re partnered with us). We’re happy to help you migrate.
API-wide changes
You must now pass the Cartesia-Version header with every request. (For WebSockets, you may pass the cartesia_version query parameter.)
Remove the /v0 prefix and instead supply a Cartesia-Version header containing the version (a date in YYYY-MM-DD format) you developed or tested your integration against. As of this documentation, the latest version is 2024-06-10.
For WebSockets, you can alternatively specify the cartesia_version query parameter, which will take precedence.
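As a rough illustration of the two ways to send the version, the sketch below builds a headers dictionary for HTTP requests and appends the query parameter to a WebSocket URL. The helper names and the api.example.com hostname are placeholders invented for this example, not part of the API.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit

# Latest version named in this guide; pin the date you actually tested against.
API_VERSION = "2024-06-10"


def versioned_headers(extra=None):
    """Return HTTP headers that include the required Cartesia-Version header."""
    headers = {"Cartesia-Version": API_VERSION}
    headers.update(extra or {})
    return headers


def versioned_ws_url(url):
    """Add cartesia_version as a query parameter, keeping any existing parameters."""
    parts = urlsplit(url)
    query = dict(parse_qsl(parts.query))
    query["cartesia_version"] = API_VERSION
    return urlunsplit(parts._replace(query=urlencode(query)))


if __name__ == "__main__":
    # Placeholder hostname; substitute the real API host.
    print(versioned_headers())
    print(versioned_ws_url("wss://api.example.com/tts/websocket"))
```

The query-parameter form is mainly useful for clients, such as browsers, that cannot set headers on a WebSocket connection.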
Voices
The /voices/{id}/embedding endpoint has been removed in favor of just /voices/{id}, which also returns other useful information about the voice. You should be able to just remove the /embedding suffix; any existing code that worked with the old endpoint should work with the new one.
Text-to-Speech
Text-to-speech endpoints have all moved from /audio to /tts, so if you were hitting /v0/audio/websocket, you should now hit /tts/websocket. (Correspondingly, for Server-Sent Events, hit /tts/sse.)
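If your integration keeps endpoint paths in configuration, the renames described so far can be applied mechanically. The following is a minimal sketch, assuming old paths carried the /v0 prefix as described above; migrate_path is a hypothetical helper written for this example, and it covers only the renames this guide mentions.

```python
import re


def migrate_path(old_path):
    """Map a version 0 path to its stable equivalent, per the renames in this guide."""
    path = old_path
    # The /v0 prefix is gone; the version now travels in the Cartesia-Version header.
    if path.startswith("/v0/"):
        path = path[len("/v0"):]
    # Text-to-speech endpoints moved from /audio to /tts.
    if path == "/audio" or path.startswith("/audio/"):
        path = "/tts" + path[len("/audio"):]
    # /voices/{id}/embedding was folded into /voices/{id}.
    path = re.sub(r"^(/voices/[^/]+)/embedding$", r"\1", path)
    return path


if __name__ == "__main__":
    for old in ("/v0/audio/websocket", "/v0/voices/some-voice-id/embedding"):
        print(old, "->", migrate_path(old))
```

Paths that are already in the new form pass through unchanged, so the helper is safe to run more than once.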
Request Format
Both the WebSocket and the Server-Sent Events endpoints now accept the same request format. (For WebSockets, send it over the WebSocket. For Server-Sent Events, send it in the request body.)
Here’s what the new request format looks like, with required fields marked:
Note that:
- Many previously optional fields are now required, such as model_id and output_format.
- Output format is now passed as an object.
- Chunk time and lookahead are no longer accepted as parameters. If specified, they will be ignored.
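A small pre-flight check can catch these changes before a request is sent. The sketch below is illustrative only: it checks just the two required fields named above (the full list may be longer), and the key names chunk_time and lookahead are guesses at how the removed parameters were spelled in your code, so adjust them to match.

```python
# The two required fields this guide names; the real list may be longer.
REQUIRED_FIELDS = ("model_id", "output_format")
# Hypothetical key names for the removed chunk time and lookahead parameters.
IGNORED_FIELDS = ("chunk_time", "lookahead")


def check_request(request):
    """Return (cleaned_request, problems) for a request dictionary."""
    problems = [
        f"missing required field: {name}"
        for name in REQUIRED_FIELDS
        if name not in request
    ]
    # Output format is now an object, not a bare string.
    if "output_format" in request and not isinstance(request["output_format"], dict):
        problems.append("output_format must be an object")
    # The API ignores these anyway; dropping them keeps requests tidy.
    cleaned = {k: v for k, v in request.items() if k not in IGNORED_FIELDS}
    return cleaned, problems


if __name__ == "__main__":
    print(check_request({"model_id": "example-model", "output_format": "pcm"}))
```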
Response Format
- We no longer return sampling_rate. The sampling rate passed in the request will be respected, and should be considered authoritative. If the passed sampling rate is unsupported, the API will throw an error.
- We no longer return length. This should be easy to calculate by decoding the data.
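One way to recover the length yourself is sketched below. It rests on assumptions this guide does not state: that each chunk's data is base64-encoded raw PCM, that the audio is mono, and that each sample is 4 bytes wide (for example 32-bit float). Adjust bytes_per_sample to match the output format you requested; audio_length is a hypothetical helper.

```python
import base64


def audio_length(encoded_chunks, sampling_rate, bytes_per_sample=4):
    """Return (sample_count, duration_seconds) for base64-encoded raw PCM chunks.

    Assumes mono audio and a fixed sample width; both are assumptions, not API facts.
    """
    total_bytes = sum(len(base64.b64decode(chunk)) for chunk in encoded_chunks)
    samples = total_bytes // bytes_per_sample
    return samples, samples / sampling_rate


if __name__ == "__main__":
    # 100 silent 4-byte samples, encoded the way a chunk is assumed to arrive.
    chunk = base64.b64encode(bytes(4 * 100)).decode("ascii")
    print(audio_length([chunk, chunk], sampling_rate=100))
```

Because the sampling rate in the request is authoritative, the same value you sent can be used to convert the sample count to seconds.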
Server-Sent Events-specific changes
- The deprecated API incorrectly implemented the Server-Sent Events response format, and was therefore rejected by standards-compliant parsers. This issue has been fixed.
- Data responses will have "done": false, and there will now be a final done event sent in the event stream, just like in the WebSocket. The final done event has the following format:
WebSocket
- You must specify inputs directly, without wrapping them in a data field, as shown under Request Format.
- The API key can now be specified in a header. (Browsers still don’t allow specifying headers for WebSocket connections, but this change makes it easy to use HTTP libraries that allow specifying global API key/version headers.)
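For code that still builds version 0 WebSocket messages, the wrapper can be removed with a small adapter. This is a sketch that assumes the old message was a dictionary with the inputs nested under a data key; unwrap_inputs is a hypothetical helper, and the field used in the usage example is only there to show the shape.

```python
def unwrap_inputs(old_message):
    """Lift inputs out of a version 0 style data wrapper, if one is present."""
    inner = old_message.get("data")
    if not isinstance(inner, dict):
        # Already in the new, unwrapped form.
        return dict(old_message)
    # Keep any sibling keys, then let the wrapped inputs take their place at the top level.
    outer = {k: v for k, v in old_message.items() if k != "data"}
    return {**outer, **inner}


if __name__ == "__main__":
    print(unwrap_inputs({"data": {"model_id": "example-model"}}))
```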