Voice Changer (SSE)
Takes an audio file of speech and returns audio of the same speech, with the same intonation, in a different voice.
Authorizations
Cartesia API key (sk_car_...). Get one at play.cartesia.ai/keys.
Headers
API version header. Value: 2026-03-01.
Body
Supported audio formats: flac, mp3, mpeg, mpga, oga, ogg, wav, webm
- Container: raw, wav, mp3
- Sample rate (Hz): 8000, 16000, 22050, 24000, 44100, 48000
- Encoding: pcm_f32le, pcm_s16le, pcm_mulaw, pcm_alaw. Required for raw and wav containers.
- Bit rate: required for mp3 containers.
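The container-dependent requirements above can be checked on the client before a file is uploaded. The following is a minimal sketch, not part of the API: the function name `check_format_options` and its parameter names are hypothetical, and it assumes the mp3-only setting is a bit rate, which the text above does not name explicitly.

```python
from typing import List, Optional

# Values copied from the Body section of this page.
CONTAINERS = {"raw", "wav", "mp3"}
SAMPLE_RATES = {8000, 16000, 22050, 24000, 44100, 48000}
ENCODINGS = {"pcm_f32le", "pcm_s16le", "pcm_mulaw", "pcm_alaw"}


def check_format_options(
    container: str,
    sample_rate: int,
    encoding: Optional[str] = None,
    bit_rate: Optional[int] = None,  # assumed name for the mp3-only setting
) -> List[str]:
    """Return a list of problems; an empty list means the options look valid."""
    problems: List[str] = []
    if container not in CONTAINERS:
        problems.append(f"unsupported container: {container!r}")
    if sample_rate not in SAMPLE_RATES:
        problems.append(f"unsupported sample rate: {sample_rate!r}")
    # An encoding is required only for raw and wav containers.
    if container in ("raw", "wav"):
        if encoding is None:
            problems.append(f"encoding is required for {container} containers")
        elif encoding not in ENCODINGS:
            problems.append(f"unsupported encoding: {encoding!r}")
    # Assumption: the setting required for mp3 containers is a bit rate.
    if container == "mp3" and bit_rate is None:
        problems.append("bit rate is required for mp3 containers")
    return problems
```

For example, `check_format_options("wav", 44100, encoding="pcm_s16le")` returns an empty list, while `check_format_options("raw", 16000)` reports the missing encoding.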
Response
Server-sent events stream. Each frame is data: <json>\n\n where the JSON payload matches VoiceChangerSSEEvent.
- VoiceChangerSSEChunk
- VoiceChangerSSEDone
- VoiceChangerSSEError
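To illustrate the framing described above, here is a minimal sketch of a parser that turns a stream of text lines into decoded JSON payloads, one per `data: <json>\n\n` frame. The function name `iter_sse_events` is hypothetical, and the sketch ignores other server-sent event fields (`event:`, `id:`, comments), which this page does not mention.

```python
import json
from typing import Dict, Iterable, Iterator, List


def iter_sse_events(lines: Iterable[str]) -> Iterator[Dict]:
    """Yield one decoded JSON payload per server-sent event frame.

    A frame is one or more `data:` lines terminated by a blank line.
    """
    data_parts: List[str] = []
    for raw in lines:
        line = raw.rstrip("\r\n")
        if line == "":
            # A blank line closes the current frame.
            if data_parts:
                yield json.loads("\n".join(data_parts))
                data_parts = []
        elif line.startswith("data:"):
            value = line[len("data:"):]
            # One optional space after the colon is not part of the payload.
            if value.startswith(" "):
                value = value[1:]
            data_parts.append(value)
        # Any other field is ignored in this sketch.
    # A frame left unterminated when the stream ends is dropped.
```

With an HTTP client that exposes the response body line by line, the same function can consume the stream directly. A small offline check, using a made-up `done` key purely for illustration:

```python
stream = 'data: {"done": false}\n\ndata: {"done": true}\n\n'
events = list(iter_sse_events(stream.splitlines(keepends=True)))
# events == [{"done": False}, {"done": True}]
```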
Audio data chunk.
- HTTP-style status code. Always 206 for chunk events.
- Whether this is the final event for the request. Always false for chunk events.
- Base64-encoded audio data.
- The sample rate of the audio in Hz.
- Server-side processing time for this chunk in milliseconds.
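Since each chunk carries base64-encoded audio, a client has to decode and concatenate the chunks to recover the audio bytes. The sketch below shows one way to do that over already-parsed event payloads. The key names `status_code`, `done`, and `data` are assumptions made for illustration, because this page describes the chunk fields without naming them; the function name `collect_audio` is also hypothetical.

```python
import base64
from typing import Dict, Iterable


def collect_audio(events: Iterable[Dict]) -> bytes:
    """Concatenate the decoded audio of consecutive chunk events.

    Chunk events are recognised by the properties stated above: a status
    code of 206 and a final-event flag of false. The first event that does
    not match (for example a done or error event) ends collection.
    """
    audio = bytearray()
    for event in events:
        # Assumed key names: "status_code", "done", "data".
        is_chunk = event.get("status_code") == 206 and event.get("done") is False
        if not is_chunk:
            break
        audio += base64.b64decode(event["data"])
    return bytes(audio)
```

The returned bytes are in whatever container and encoding the request asked for, so a raw PCM result still needs the sample rate reported in the chunk to be played back correctly.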