POST /tts/sse
Text to Speech (SSE)
curl --request POST \
  --url https://api.cartesia.ai/tts/sse \
  --header 'Cartesia-Version: <cartesia-version>' \
  --header 'Content-Type: application/json' \
  --header 'X-API-Key: <api-key>' \
  --data '{
  "model_id": "<string>",
  "transcript": "<string>",
  "voice": {
    "mode": "id",
    "id": "<string>",
    "__experimental_controls": {
      "speed": 123,
      "emotion": [
        "anger:lowest"
      ]
    }
  },
  "language": "en",
  "output_format": {
    "container": "raw",
    "encoding": "pcm_f32le",
    "sample_rate": 123
  },
  "duration": 123,
  "speed": "slow",
  "add_timestamps": true,
  "add_phoneme_timestamps": true,
  "use_normalized_timestamps": true,
  "context_id": "<string>"
}'
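
The same call can be made from Python. The sketch below uses only the standard library: it mirrors the URL and headers of the curl example and reads the `text/event-stream` response one line at a time. The helper names (`build_request`, `iter_sse_events`, `stream_tts`) are illustrative and not part of the API. The sketch makes no assumption about the event names or payload schema, which this reference does not show; it only applies generic SSE framing.

```python
import json
import urllib.request
from typing import Iterable, Iterator, Tuple

API_URL = "https://api.cartesia.ai/tts/sse"


def build_request(api_key: str, version: str, body: dict) -> urllib.request.Request:
    """Build (but do not send) the POST request shown in the curl example."""
    return urllib.request.Request(
        API_URL,
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Cartesia-Version": version,
            "Content-Type": "application/json",
            "X-API-Key": api_key,
        },
        method="POST",
    )


def iter_sse_events(lines: Iterable[bytes]) -> Iterator[Tuple[str, str]]:
    """Yield (event, data) pairs from the lines of a text/event-stream body.

    Generic SSE framing: "field: value" lines, a blank line ends an event,
    lines starting with ":" are comments. An event that is not terminated
    by a blank line before the stream ends is discarded.
    """
    event, data = "message", []
    for raw in lines:
        line = raw.decode("utf-8").rstrip("\r\n")
        if line == "":
            if data:
                yield event, "\n".join(data)
            event, data = "message", []
        elif line.startswith(":"):
            continue  # comment / keep-alive line
        else:
            field, _, value = line.partition(":")
            if value.startswith(" "):
                value = value[1:]  # strip the single optional leading space
            if field == "event":
                event = value
            elif field == "data":
                data.append(value)


def stream_tts(api_key: str, version: str, body: dict) -> Iterator[Tuple[str, str]]:
    """Send the request and yield SSE events as they arrive (needs network access)."""
    with urllib.request.urlopen(build_request(api_key, version, body)) as resp:
        yield from iter_sse_events(resp)
```

Iterating over `stream_tts(api_key, "2025-04-16", body)` yields each event's name and raw `data` string. Decoding that string is left to the caller, since the response schema is not described here.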

Authorizations

X-API-Key (string, header, required)

Headers

Cartesia-Version (enum<string>, required)
Available options: 2024-06-10, 2024-11-13, 2025-04-16

Body

application/json
model_id (string, required)
transcript (string, required)
voice (object, required)
  • TTSRequestIdSpecifier
  • TTSRequestEmbeddingSpecifier
output_format (object, required)
language (enum<string>)
Available options: en, fr, de, es, pt, zh, ja, hi, it, ko, nl, pl, ru, sv, tr
duration (number | null)
speed (enum<string>)
Available options: slow, normal, fast
add_timestamps (boolean | null)
add_phoneme_timestamps (boolean | null)
use_normalized_timestamps (boolean | null)
context_id (string)
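
The field list above can be turned into a small client-side pre-check that catches obvious mistakes before a request is sent. This is a minimal sketch built only from the required flags, types, and enum options listed in this section. The function name `validate_tts_body` and its error messages are hypothetical, and the server remains the authority on what it accepts (the nested `voice` and `output_format` objects are not checked here).

```python
from typing import Any, Dict, List

# Taken from the Body section of this reference.
REQUIRED_FIELDS = ("model_id", "transcript", "voice", "output_format")
LANGUAGES = {"en", "fr", "de", "es", "pt", "zh", "ja", "hi",
             "it", "ko", "nl", "pl", "ru", "sv", "tr"}
SPEEDS = {"slow", "normal", "fast"}
BOOLEAN_FLAGS = ("add_timestamps", "add_phoneme_timestamps",
                 "use_normalized_timestamps")


def validate_tts_body(body: Dict[str, Any]) -> List[str]:
    """Return a list of human-readable problems; an empty list means none found."""
    problems: List[str] = []

    for name in REQUIRED_FIELDS:
        if name not in body:
            problems.append(f"missing required field: {name}")

    if "language" in body and body["language"] not in LANGUAGES:
        problems.append(f"unsupported language: {body['language']!r}")

    if "speed" in body and body["speed"] not in SPEEDS:
        problems.append(f"unsupported speed: {body['speed']!r}")

    # These flags are documented as "boolean | null".
    for flag in BOOLEAN_FLAGS:
        value = body.get(flag)
        if value is not None and not isinstance(value, bool):
            problems.append(f"{flag} must be a boolean or null")

    # duration is documented as "number | null"; bool is excluded explicitly
    # because Python treats it as a subclass of int.
    duration = body.get("duration")
    if duration is not None and (
        isinstance(duration, bool) or not isinstance(duration, (int, float))
    ):
        problems.append("duration must be a number or null")

    return problems
```

Returning a list rather than raising on the first problem lets a caller report every issue at once.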