WSS /stt/websocket

Messages
model
type:string
required

ID of the model to use for transcription, e.g. ink-2. See Models for available models.

encoding
type:string
required

The encoding format of the audio data. This determines how the server interprets the raw binary audio data you send.

For guidance on choosing an encoding, see Audio encodings.

sample_rate
type:string
required

The sample rate of the audio in Hz.

cartesia_version
type:string
required

API version. Provide this either by adding cartesia_version=2026-03-01 as a URL query parameter or by sending Cartesia-Version: 2026-03-01 as a request header.

Browser WebSocket clients cannot set request headers, so they should pass the version as a query parameter in the URL.

X-API-Key
type:httpApiKey

API key passed in a header.

access_token
type:httpApiKey

A short-lived access token passed as a query parameter to make API requests from a client. This is particularly useful in the browser, where WebSockets do not support custom headers. See Authenticate client apps for how to generate an access token.
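As a rough illustration of how the query parameters above fit together, the sketch below assembles a connection URL. The host name api.cartesia.ai and the encoding value pcm_s16le are assumptions that do not appear in this section, and build_stt_url is a hypothetical helper, not part of any SDK. Substitute the real values from the Models and Audio encodings pages.

```python
from urllib.parse import urlencode

# Assumption: the host is not stated in this section; replace it as needed.
STT_HOST = "api.cartesia.ai"


def build_stt_url(model, encoding, sample_rate, cartesia_version, access_token=None):
    """Return a wss:// URL carrying the required query parameters.

    Hypothetical helper for illustration only.
    """
    params = {
        "model": model,
        "encoding": encoding,
        "sample_rate": str(sample_rate),  # query values are sent as strings
        "cartesia_version": cartesia_version,
    }
    # Only needed when authenticating from a client without header support.
    if access_token is not None:
        params["access_token"] = access_token
    return f"wss://{STT_HOST}/stt/websocket?{urlencode(params)}"


# "pcm_s16le" is an assumed example encoding value.
url = build_stt_url("ink-2", "pcm_s16le", 16000, "2026-03-01")
print(url)
```

Server-side clients that can set headers would instead send X-API-Key (and optionally Cartesia-Version) as request headers and leave access_token out of the URL.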

query
type:object
Send Audio Data
type:string

Send WebSocket binary messages containing raw audio data as specified by the encoding and sample_rate query parameters.

Audio Requirements:

  • Send audio in small chunks (e.g., 100ms intervals) for optimal latency
  • Audio format must match the encoding and sample_rate parameters
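To make the 100 ms guideline concrete, here is a minimal sketch of how a chunk size in bytes could be derived and applied. It assumes uncompressed PCM audio with a fixed number of bytes per sample (2 bytes for 16-bit audio) and a single channel; chunk_size_bytes and iter_chunks are hypothetical helper names.

```python
def chunk_size_bytes(sample_rate, bytes_per_sample, channels=1, chunk_ms=100):
    """Bytes of raw PCM audio covering chunk_ms milliseconds."""
    return sample_rate * bytes_per_sample * channels * chunk_ms // 1000


def iter_chunks(audio, size):
    """Yield consecutive slices of at most `size` bytes from `audio`."""
    for start in range(0, len(audio), size):
        yield audio[start:start + size]


# Assumed example: 16-bit mono PCM at 16 kHz, so 100 ms is 3200 bytes.
size = chunk_size_bytes(16000, 2)
audio = bytes(8000)  # 250 ms of silence as placeholder audio
chunks = list(iter_chunks(audio, size))
print(size, [len(c) for c in chunks])
```

Each yielded slice would be sent as one binary WebSocket message. The final slice may be shorter than the others.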
Finalize Command
type:string

Send finalize as a text message when the user is done speaking to receive the transcript for any buffered audio.

Example: finalize
Close Command
type:string

Send close as a text message to flush any remaining audio, close the session, and receive a done acknowledgment.

Example: close
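The order of client messages described above (binary audio, then finalize, then close) can be sketched without committing to a particular WebSocket library. In this sketch, send is a hypothetical callable that transmits bytes as a binary message and str as a text message; with a real client it would wrap the connection's own send method. Reading the server's responses is left out for brevity.

```python
def run_session(send, chunks):
    """Send audio chunks, then the finalize and close commands.

    `send` is a hypothetical callable: bytes go out as a binary message,
    str goes out as a text message.
    """
    for chunk in chunks:
        send(chunk)       # binary message containing raw audio
    send("finalize")      # request the transcript for any buffered audio
    send("close")         # flush remaining audio and end the session


# Stand-in transport that records what would have been sent.
sent = []
run_session(sent.append, [bytes(3200), bytes(3200)])
print([m if isinstance(m, str) else f"<{len(m)} bytes>" for m in sent])
```

A real client would typically wait for the acknowledgment of finalize before sending close, and for the acknowledgment of close before dropping the connection.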
Transcript Response
type:object

Transcript chunks.

Send the finalize command after the user is done speaking so that the API emits transcript chunks for any buffered audio. The API may also send transcript chunks before you send the finalize command.

Flush Done Response
type:object

Acknowledgment for the finalize command.

Done Response
type:object

Acknowledgment for the close command.

Error Response
type:object

Error information for STT WebSocket connections.
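One way a client might route the four server messages listed above is sketched below. This section does not show the response schemas, so every field name here is an assumption: the sketch supposes that each message is a JSON object with a type field whose value is transcript, flush_done, done, or error, with the transcript carried in text and the error detail in message. Check the actual response objects before relying on any of these names.

```python
import json


def classify_message(raw):
    """Map a server text message to a (kind, payload) pair.

    All field names ("type", "text", "message") are assumptions.
    """
    message = json.loads(raw)
    kind = message.get("type")
    if kind == "transcript":
        return ("transcript", message.get("text", ""))
    if kind == "flush_done":   # assumed acknowledgment of finalize
        return ("flush_done", None)
    if kind == "done":         # assumed acknowledgment of close
        return ("done", None)
    if kind == "error":
        return ("error", message.get("message", ""))
    return ("unknown", message)


print(classify_message('{"type": "transcript", "text": "hello"}'))
print(classify_message('{"type": "done"}'))
```

Returning an explicit unknown kind, rather than raising, keeps the client tolerant of message types it does not recognize.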