Text to Speech (SSE)
Headers
Request
The ID of the model to use for the generation. See Models for available models.
The language that the given voice should speak the transcript in.
Options: English (en), French (fr), German (de), Spanish (es), Portuguese (pt), Chinese (zh), Japanese (ja), Hindi (hi), Italian (it), Korean (ko), Dutch (nl), Polish (pl), Russian (ru), Swedish (sv), Turkish (tr).
This feature is experimental and may not work for all voices.
Speed setting for the model. Defaults to normal.
Influences the speed of the generated speech. Faster speeds may reduce hallucination rate.
Whether to return word-level timestamps. If false (default), no word timestamps are produced. If true, the server returns timestamp events containing word-level timing information.
Whether to return phoneme-level timestamps. If false (default), no phoneme timestamps are produced; if add_timestamps is true, the timestamps produced are word timestamps instead. If true, the server returns timestamp events containing phoneme-level timing information.
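As a rough illustration of consuming timestamp events from the stream, the sketch below parses raw Server-Sent Events text and collects word-level timings. The line framing (event:, data:, blank-line terminator) follows the general SSE format. The event name "timestamps" and the payload layout (a word_timestamps object with parallel words, start, and end arrays) are assumptions made for the example and are not confirmed by this page.

```python
import json


def parse_sse(stream_text):
    """Split raw SSE text into (event_name, data) pairs."""
    events = []
    event_name, data_lines = "message", []
    for line in stream_text.splitlines():
        if line == "":
            # A blank line terminates the current event.
            if data_lines:
                events.append((event_name, "\n".join(data_lines)))
            event_name, data_lines = "message", []
        elif line.startswith(":"):
            continue  # SSE comment / keep-alive line
        elif line.startswith("event:"):
            event_name = line[len("event:"):].strip()
        elif line.startswith("data:"):
            value = line[len("data:"):]
            # The SSE format strips one leading space after the colon.
            data_lines.append(value[1:] if value.startswith(" ") else value)
    if data_lines:
        events.append((event_name, "\n".join(data_lines)))
    return events


def word_timestamps(stream_text):
    """Collect (word, start, end) tuples from timestamp events.

    ASSUMPTION: the event name "timestamps" and the payload shape
    {"word_timestamps": {"words": [...], "start": [...], "end": [...]}}
    are hypothetical and used only for illustration.
    """
    timings = []
    for name, data in parse_sse(stream_text):
        if name != "timestamps":
            continue
        payload = json.loads(data)
        wt = payload["word_timestamps"]
        timings.extend(zip(wt["words"], wt["start"], wt["end"]))
    return timings
```

Events other than the assumed timestamp event (for example, audio chunks) are skipped by word_timestamps, so the same stream can be passed to a separate audio handler.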
Whether to use normalized timestamps (true) or original timestamps (false).
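To show how the options above might fit together, here is a minimal sketch that assembles the option portion of a request body and checks the language code against the list given earlier. Only add_timestamps is named on this page; the other key names (model_id, language, speed, add_phoneme_timestamps, use_normalized_timestamps) and the build_request_options helper are assumptions for illustration. The body is deliberately partial: a real request also needs the transcript and voice, which this section does not describe.

```python
# Language codes listed in the request documentation above.
SUPPORTED_LANGUAGES = {
    "en", "fr", "de", "es", "pt", "zh", "ja", "hi",
    "it", "ko", "nl", "pl", "ru", "sv", "tr",
}


def build_request_options(
    model_id,
    language="en",
    speed="normal",
    add_timestamps=False,
    add_phoneme_timestamps=False,
    use_normalized_timestamps=None,
):
    """Build a partial request body as a plain dict.

    ASSUMPTION: every key name except add_timestamps is hypothetical.
    """
    if language not in SUPPORTED_LANGUAGES:
        raise ValueError(f"unsupported language code: {language!r}")
    body = {
        "model_id": model_id,
        "language": language,
        "speed": speed,
        "add_timestamps": add_timestamps,
        "add_phoneme_timestamps": add_phoneme_timestamps,
    }
    # Leave the field out entirely when the caller has no preference.
    if use_normalized_timestamps is not None:
        body["use_normalized_timestamps"] = use_normalized_timestamps
    return body
```

A call such as build_request_options("some-model", language="fr", add_timestamps=True) requests word-level timestamps for French speech; "some-model" is a placeholder, not a real model ID.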