Authorizations
Headers
API version header. Must be set to the API version, e.g. '2024-06-10'.
Available options: 2024-06-10, 2024-11-13, 2025-04-16
Example: "2024-06-10"
Body
The ID of the model to use for generating audio
The language of the transcript
The infill text to generate
The ID of the voice to use for generating audio
The format of the output audio
Available options: raw, wav, mp3
The sample rate of the output audio in Hz. Supported sample rates are 8000, 16000, 22050, 24000, 44100, and 48000.
Required for raw and wav containers.
Available options: pcm_f32le, pcm_s16le, pcm_mulaw, pcm_alaw
Required for mp3 containers.
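The output format constraints above can be checked on the client before a request is sent. The following is a minimal sketch, not part of the API: the function name and the parameter names container, sample_rate, and encoding are assumptions chosen for illustration, since this page does not show the actual field names. The mp3-only field is left out because it is not described here.

```python
# Values copied from the reference text above.
SUPPORTED_CONTAINERS = {"raw", "wav", "mp3"}
SUPPORTED_SAMPLE_RATES = {8000, 16000, 22050, 24000, 44100, 48000}
SUPPORTED_ENCODINGS = {"pcm_f32le", "pcm_s16le", "pcm_mulaw", "pcm_alaw"}


def check_output_format(container, sample_rate, encoding=None):
    """Return a list of problems; an empty list means no problem was found.

    The parameter names are hypothetical; map them to the real request
    fields before use.
    """
    problems = []
    if container not in SUPPORTED_CONTAINERS:
        problems.append(f"unsupported container: {container!r}")
    if sample_rate not in SUPPORTED_SAMPLE_RATES:
        problems.append(f"unsupported sample rate: {sample_rate!r}")
    # The pcm_* option is required only for raw and wav containers.
    if container in ("raw", "wav"):
        if encoding is None:
            problems.append("encoding is required for raw and wav containers")
        elif encoding not in SUPPORTED_ENCODINGS:
            problems.append(f"unsupported encoding: {encoding!r}")
    return problems


print(check_output_format("wav", 44100, "pcm_s16le"))
print(check_output_format("raw", 44100))
```

Returning a list rather than raising on the first failure lets a caller report every problem in one pass.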
Either a number between -1.0 and 1.0 or a natural language description of speed.
If you specify a number, 0.0 is the default speed, -1.0 is the slowest speed, and 1.0 is the fastest speed.
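Because the speed value accepts two shapes, a small client-side check can catch out-of-range numbers early. This is a sketch under assumptions: the helper name validate_speed is hypothetical, and since this page does not list the accepted natural language descriptions, any non-empty string is passed through unchanged.

```python
def validate_speed(speed):
    """Validate a speed value before sending it (hypothetical helper).

    Numbers must lie in [-1.0, 1.0]; 0.0 is the default speed.
    Strings are treated as natural language descriptions and are not
    checked against any list, because the reference does not give one.
    """
    # bool is a subclass of int in Python, so reject it explicitly.
    if isinstance(speed, bool):
        raise TypeError("speed must be a number or a string, not a bool")
    if isinstance(speed, (int, float)):
        if not -1.0 <= speed <= 1.0:
            raise ValueError(f"speed {speed} is outside the range -1.0 to 1.0")
        return float(speed)
    if isinstance(speed, str) and speed.strip():
        return speed.strip()
    raise TypeError("speed must be a number or a non-empty string")


print(validate_speed(0))      # default speed
print(validate_speed(-1.0))   # slowest speed
print(validate_speed(1.0))    # fastest speed
```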
An array of emotion:level tags.
Supported emotions are: anger, positivity, surprise, sadness, and curiosity.
Supported levels are: lowest, low, high, and highest. Omitting the level selects the setting between low and high.
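The emotion:level tag format can be assembled with a small helper. This is a sketch, not the API's own code: the function name emotion_tag is hypothetical, and it assumes that omitting the level means sending the bare emotion name with no colon, which this page implies but does not state.

```python
# Values copied from the reference text above.
SUPPORTED_EMOTIONS = ("anger", "positivity", "surprise", "sadness", "curiosity")
SUPPORTED_LEVELS = ("lowest", "low", "high", "highest")


def emotion_tag(emotion, level=None):
    """Build one emotion tag (hypothetical helper).

    With level=None the tag is the bare emotion name; this no-colon form
    for an omitted level is an assumption.
    """
    if emotion not in SUPPORTED_EMOTIONS:
        raise ValueError(f"unsupported emotion: {emotion!r}")
    if level is None:
        return emotion
    if level not in SUPPORTED_LEVELS:
        raise ValueError(f"unsupported level: {level!r}")
    return f"{emotion}:{level}"


# An array of tags, as the field expects.
tags = [emotion_tag("positivity", "high"), emotion_tag("curiosity")]
print(tags)
```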