POST /voice-changer/sse
Voice Changer (SSE)
curl --request POST \
  --url https://api.cartesia.ai/voice-changer/sse \
  --header 'Cartesia-Version: <cartesia-version>' \
  --header 'Content-Type: multipart/form-data' \
  --header 'X-API-Key: <api-key>' \
  --form clip='@example-file' \
  --form 'voice[id]=<string>' \
  --form 'output_format[container]=raw' \
  --form 'output_format[sample_rate]=44100' \
  --form 'output_format[encoding]=pcm_f32le'
{
  "done": false,
  "status_code": 206,
  "step_time": 123,
  "data": "aSDinaTvuI8gbWludGxpZnk=",
  "sample_rate": 44100
}


Authorizations

X-API-Key
string
header
required

Headers

Cartesia-Version
enum<string>
required

API version header.

Available options: 2024-06-10, 2024-11-13, 2025-04-16, 2026-03-01

Example: "2024-06-10"
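
As an illustration of the two required headers, here is a minimal Python sketch. The helper name `build_headers` and its default version are hypothetical, not part of any Cartesia SDK. It deliberately leaves out `Content-Type`, assuming the HTTP client generates the multipart header and its boundary when it encodes the form.

```python
# Versions listed under "Available options" for Cartesia-Version.
SUPPORTED_VERSIONS = ("2024-06-10", "2024-11-13", "2025-04-16", "2026-03-01")


def build_headers(api_key: str, version: str = "2024-06-10") -> dict:
    """Return the request headers for POST /voice-changer/sse.

    Hypothetical helper: it only assembles and validates the two
    documented headers and does not contact the API.
    """
    if not api_key:
        raise ValueError("X-API-Key is required")
    if version not in SUPPORTED_VERSIONS:
        raise ValueError(f"unsupported Cartesia-Version: {version!r}")
    # Content-Type is intentionally omitted (assumption: the HTTP client
    # sets multipart/form-data with a boundary when encoding the body).
    return {"X-API-Key": api_key, "Cartesia-Version": version}
```

Rejecting an unknown version locally surfaces a typo before any audio is uploaded.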

Body

multipart/form-data
clip
file
voice[id]
string
output_format[container]
enum<string>
Available options: raw, wav, mp3
output_format[sample_rate]
integer

The sample rate of the audio in Hz. Supported sample rates are 8000, 16000, 22050, 24000, 44100, 48000.

output_format[encoding]
enum<string> | null

Required for raw and wav containers.

Available options: pcm_f32le, pcm_s16le, pcm_mulaw, pcm_alaw
output_format[bit_rate]
integer | null

Required for mp3 containers.
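
The rules above (encoding required for raw and wav, bit rate required for mp3, a fixed set of sample rates) can be checked before sending a request. The following Python sketch is one way to do that. `build_form_fields` is a hypothetical helper, and it assumes the field that is not required for the chosen container may simply be omitted, which this page implies by marking it nullable but does not state outright. The `clip` file part is not included and would be attached separately by the HTTP client.

```python
SUPPORTED_SAMPLE_RATES = {8000, 16000, 22050, 24000, 44100, 48000}
CONTAINERS = {"raw", "wav", "mp3"}
ENCODINGS = {"pcm_f32le", "pcm_s16le", "pcm_mulaw", "pcm_alaw"}


def build_form_fields(voice_id, container, sample_rate, encoding=None, bit_rate=None):
    """Return the non-file multipart fields as a dict of strings.

    Hypothetical helper: validates the documented constraints locally
    and does not contact the API.
    """
    if container not in CONTAINERS:
        raise ValueError(f"unknown container: {container!r}")
    if sample_rate not in SUPPORTED_SAMPLE_RATES:
        raise ValueError(f"unsupported sample rate: {sample_rate}")

    fields = {
        "voice[id]": voice_id,
        "output_format[container]": container,
        "output_format[sample_rate]": str(sample_rate),
    }
    if container in ("raw", "wav"):
        # encoding is required for raw and wav containers
        if encoding not in ENCODINGS:
            raise ValueError("encoding is required for raw and wav containers")
        fields["output_format[encoding]"] = encoding
    else:
        # bit_rate is required for mp3 containers
        if bit_rate is None:
            raise ValueError("bit_rate is required for mp3 containers")
        fields["output_format[bit_rate]"] = str(bit_rate)
    return fields
```

For example, `build_form_fields("<voice-id>", "raw", 44100, encoding="pcm_f32le")` yields the same voice and output format fields as a curl request for raw 44.1 kHz float PCM output.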

Response

200 - text/event-stream

Server-sent events stream. Each frame has the form data: <json>\n\n, where the JSON payload matches VoiceChangerSSEEvent.
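
To make the framing concrete, here is a minimal Python sketch that turns an iterable of text lines (for instance, the decoded lines of the response body) into JSON payloads. `iter_sse_events` is a hypothetical name. The sketch handles only `data:` fields and comment lines, and it assumes each payload is a single JSON object. Like the SSE specification, it drops a trailing frame that is not terminated by a blank line.

```python
import json
from typing import Iterable, Iterator


def iter_sse_events(lines: Iterable[str]) -> Iterator[dict]:
    """Yield one parsed JSON payload per SSE frame.

    A frame is one or more `data:` lines followed by a blank line.
    """
    data_lines = []
    for raw in lines:
        line = raw.rstrip("\r\n")
        if line == "":
            # A blank line terminates the current frame.
            if data_lines:
                yield json.loads("\n".join(data_lines))
                data_lines = []
            continue
        if line.startswith(":"):
            continue  # SSE comment line, often used as a keep-alive
        field, _, value = line.partition(":")
        if field == "data":
            # A single leading space after the colon is not part of the value.
            data_lines.append(value[1:] if value.startswith(" ") else value)
        # Other fields (event, id, retry) are ignored in this sketch.
```

Parsing from an iterable of lines keeps the function independent of any particular HTTP client, so it can be exercised with an in-memory list.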

Audio data chunk.

status_code
enum<integer>
required

HTTP-style status code. Always 206 for chunk events.

Available options: 206
done
enum<boolean>
required

Whether this is the final event for the request. Always false for chunk events.

Available options: false
data
string
required

Base64-encoded audio data.

sample_rate
integer
required

The sample rate of the audio in Hz.

step_time
number
required

Server-side processing time for this chunk in milliseconds.
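
The chunk fields described above can be combined on the client side. The Python sketch below decodes each chunk's Base64 `data`, concatenates the bytes, and sums `step_time`. `collect_audio` is a hypothetical helper operating on already-parsed event dicts. It skips any event whose `status_code` is not 206, since this section describes only chunk events.

```python
import base64
from typing import Iterable, Optional, Tuple


def collect_audio(events: Iterable[dict]) -> Tuple[bytes, Optional[int], float]:
    """Return (audio_bytes, sample_rate, total_step_time_ms) for chunk events."""
    audio = bytearray()
    sample_rate = None
    total_step_ms = 0.0
    for event in events:
        if event.get("status_code") != 206:
            continue  # not an audio chunk; its shape is not covered here
        audio.extend(base64.b64decode(event["data"]))  # Base64 -> raw bytes
        sample_rate = event["sample_rate"]             # Hz
        total_step_ms += event["step_time"]            # milliseconds
    return bytes(audio), sample_rate, total_step_ms
```

How the returned bytes should be interpreted depends on the `output_format` requested: with a raw container and `pcm_f32le` encoding, for example, they are little-endian 32-bit float samples.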