Other ways to migrate and best practices for Cartesia Speech-to-Text
If you’re already using the Cartesia SDK, upgrade to version >=3.2.0.
Ink 2 only supports English right now.
We expect to add more languages in the coming months.
Connection
Replace the Deepgram WebSocket URL and auth header with Cartesia’s. Set the `cartesia_version` query param, and authenticate with a short-lived access token passed in the `access_token` query param instead of an API key.
Alternatively, connect to the turn-detection WebSocket with the Cartesia SDK.
Query parameters
| Deepgram Flux | Cartesia Ink | Notes |
|---|---|---|
| `model=flux-general-en` (required) | `model=ink-2` (required) | See STT Models for all options. |
| `encoding=linear16` | `encoding=pcm_s16le` (required) | See encoding for all options. |
| `sample_rate` | `sample_rate` (required) | No change. |
| `language_hint` | — | Only English is supported right now. Multilingual support is coming soon! |
| — | `cartesia_version=2026-03-01` (required) | See API Conventions for details. |
| `eager_eot_threshold`, `eot_threshold`, `eot_timeout_ms` | — | Turn detection is controlled by the model. Configuration is coming soon! |
| `keyterm` | — | Coming soon! |
| `mip_opt_out` | — | Controlled by your organization. |
encoding
| Deepgram | Cartesia |
|---|---|
| `linear16` | `pcm_s16le` |
| `linear32` | `pcm_s32le` |
| `mulaw` | `pcm_mulaw` |
| `alaw` | `pcm_alaw` |
| Not supported | `pcm_f16le` |
| Not supported | `pcm_f32le` |
| `opus` | Not supported |
| `ogg-opus` | Not supported |
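The two tables above can be applied mechanically to an existing set of Flux query parameters. The following is a minimal sketch, not part of either SDK: `translate_flux_params`, `ENCODING_MAP`, and `DROPPED_PARAMS` are hypothetical names, and the helper assumes the Flux parameters are available as a plain dict. It asks for `encoding` and `sample_rate` explicitly because Ink requires both.

```python
from urllib.parse import urlencode

# Deepgram encoding name -> Cartesia encoding name (from the encoding table).
ENCODING_MAP = {
    "linear16": "pcm_s16le",
    "linear32": "pcm_s32le",
    "mulaw": "pcm_mulaw",
    "alaw": "pcm_alaw",
}

# Flux parameters with no Ink equivalent in the table above; they are dropped.
DROPPED_PARAMS = {
    "language_hint", "eager_eot_threshold", "eot_threshold",
    "eot_timeout_ms", "keyterm", "mip_opt_out",
}


def translate_flux_params(flux_params, cartesia_version="2026-03-01"):
    """Hypothetical helper: map Flux query params to Ink query params.

    Returns (ink_params, dropped), where `dropped` lists the Flux
    parameters that were present but have no Ink equivalent.
    """
    encoding = flux_params.get("encoding")
    if encoding not in ENCODING_MAP:
        # Covers a missing encoding as well as opus and ogg-opus.
        raise ValueError(f"no Cartesia equivalent for encoding {encoding!r}")
    if "sample_rate" not in flux_params:
        raise ValueError("sample_rate is required")
    ink_params = {
        "model": "ink-2",
        "encoding": ENCODING_MAP[encoding],
        "sample_rate": str(flux_params["sample_rate"]),
        "cartesia_version": cartesia_version,
    }
    dropped = sorted(DROPPED_PARAMS & flux_params.keys())
    return ink_params, dropped
```

Passing the returned dict to `urlencode` yields a query string to append to the Cartesia WebSocket URL; the `dropped` list is useful for logging which Flux settings no longer take effect.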
Sending audio
Both APIs accept raw PCM audio as binary WebSocket frames in the same way. Cartesia does not support these encodings: `opus`, `ogg-opus`.
To close the session, send a JSON-encoded WebSocket text frame.
Cartesia has no equivalent of Deepgram’s `Configure` control message since there’s no need to configure end-of-turn.
Event mapping
| Deepgram `type` | Cartesia `type` |
|---|---|
| `Connected` | `connected` |
| `Error` | `error` |
| `TurnInfo` | See below |
Deepgram sends a single `TurnInfo` message with an `event` discriminator. Cartesia emits one message type per event, identified by the top-level `type` field.
| Deepgram `TurnInfo.event` | Cartesia `type` | Carries transcript? |
|---|---|---|
| `StartOfTurn` | `turn.start` | No (Deepgram: yes) |
| `Update` | `turn.update` | Yes |
| `EagerEndOfTurn` | `turn.eager_end` | Yes |
| `TurnResumed` | `turn.resume` | No (Deepgram: yes) |
| `EndOfTurn` | `turn.end` | Yes |
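In both the Deepgram `TurnInfo` message and the Cartesia `turn.*` events, `transcript` is cumulative within a turn: each message carries the full transcript of the turn so far, not just the new words.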
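The mapping above can be sketched as a small dispatch on the top-level `type` field. This is an illustration under stated assumptions, not SDK code: `ink_type_for` and `handle_ink_message` are hypothetical names, messages are assumed to be already decoded from JSON into dicts, and only the `type`, `event`, and `transcript` fields named in this guide are used (real messages may carry more).

```python
# Deepgram TurnInfo.event -> Cartesia message type (from the table above).
EVENT_TO_TYPE = {
    "StartOfTurn": "turn.start",
    "Update": "turn.update",
    "EagerEndOfTurn": "turn.eager_end",
    "TurnResumed": "turn.resume",
    "EndOfTurn": "turn.end",
}

# Non-turn message types (from the event mapping table).
TYPE_TO_TYPE = {"Connected": "connected", "Error": "error"}

# Cartesia turn events that carry a transcript, per the table above.
CARRIES_TRANSCRIPT = {"turn.update", "turn.eager_end", "turn.end"}


def ink_type_for(flux_message):
    """Return the Cartesia `type` that corresponds to a Flux message."""
    if flux_message["type"] == "TurnInfo":
        return EVENT_TO_TYPE[flux_message["event"]]
    return TYPE_TO_TYPE[flux_message["type"]]


def handle_ink_message(message, state):
    """Update `state` from one decoded Ink server message."""
    msg_type = message["type"]
    if msg_type in CARRIES_TRANSCRIPT:
        # Cumulative within a turn: overwrite, do not append.
        state["current"] = message["transcript"]
    if msg_type == "turn.start":
        state["current"] = ""
    elif msg_type == "turn.end":
        state["finished"].append(state["current"])
        state["current"] = ""
    return state
```

Because `turn.start` and `turn.resume` carry no transcript on the Cartesia side, the handler never reads `transcript` from them; a handler ported from Flux that does so needs that read removed.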
Event handler
The branching structure of your handler is unchanged; only the message shape differs.
Example Server Messages
Flux’s transcripts are joined with spaces. Ink’s are not: as the second turn in the table shows, an Ink transcript carries its own leading space.
| Deepgram Flux | Cartesia Realtime STT (Auto) |
|---|---|
| `StartOfTurn` | `turn.start` |
| `Update` "Flux's transcripts" | `turn.update` "Flux's transcripts" |
| `EagerEndOfTurn` "Flux's transcripts" | `turn.eager_end` "Flux's transcripts" |
| `TurnResumed` | `turn.resume` |
| `Update` "Flux's transcripts are joined with spaces." | `turn.update` "Flux's transcripts are joined with spaces." |
| `EagerEndOfTurn` "Flux's transcripts are joined with spaces." | `turn.eager_end` "Flux's transcripts are joined with spaces." |
| `EndOfTurn` "Flux's transcripts are joined with spaces." | `turn.end` "Flux's transcripts are joined with spaces." |
| `StartOfTurn` | `turn.start` |
| `Update` "Ink's are not." | `turn.update` " Ink's are not." |
| `EagerEndOfTurn` "Ink's are not." | `turn.eager_end` " Ink's are not." |
| `EndOfTurn` "Ink's are not." | `turn.end` " Ink's are not." |
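The practical consequence of the example above is in how final turn transcripts are stitched into one string. A minimal sketch, assuming the per-turn transcripts have already been collected into lists (the function names are hypothetical):

```python
def join_flux_turns(transcripts):
    """Flux turn transcripts have no leading space, so add a separator."""
    return " ".join(transcripts)


def join_ink_turns(transcripts):
    """Ink turn transcripts carry their own leading space; concatenate as-is."""
    return "".join(transcripts)


flux_turns = ["Flux's transcripts are joined with spaces.", "Ink's are not."]
ink_turns = ["Flux's transcripts are joined with spaces.", " Ink's are not."]

# Both calls produce the same combined text.
combined_flux = join_flux_turns(flux_turns)
combined_ink = join_ink_turns(ink_turns)
```

Reusing the Flux join on Ink transcripts would produce a double space between turns, so the separator is the one line to change when porting this step.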
References
API Reference: Cartesia Realtime STT (Auto)
Full Code Example: Using the Cartesia SDK