Authorizations
An Access Token
Headers
API version header. Must be set to the API version, e.g. '2024-06-10'.
Available options: 2024-06-10, 2024-11-13, 2025-04-16
Example: "2025-04-16"
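As a rough illustration of the two items above, the following sketch assembles request headers and rejects a version string that is not one of the listed options. The header name `Cartesia-Version` and the `Bearer` authorization scheme are assumptions, since this page does not spell them out; `build_headers` is a hypothetical helper, not part of any SDK.

```python
# Version strings listed in this reference.
SUPPORTED_VERSIONS = ("2024-06-10", "2024-11-13", "2025-04-16")

# Assumption: the version header is named "Cartesia-Version".
# Confirm the exact name before relying on it.
VERSION_HEADER = "Cartesia-Version"


def build_headers(access_token: str, api_version: str = "2025-04-16") -> dict:
    """Return HTTP headers for a request, validating the API version."""
    if api_version not in SUPPORTED_VERSIONS:
        raise ValueError(
            f"Unsupported API version {api_version!r}; "
            f"expected one of {SUPPORTED_VERSIONS}"
        )
    return {
        # Assumption: the access token is sent as a Bearer token.
        "Authorization": f"Bearer {access_token}",
        VERSION_HEADER: api_version,
        "Content-Type": "application/json",
    }
```

Validating the version locally is optional, but it surfaces a typo in the date string before the request is sent.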
Body
Configure the various attributes of the generated speech. These are only for sonic-3 and have no effect on earlier models.
See Volume, Speed, and Emotion in Sonic-3 for a guide on this option.
The language that the given voice should speak the transcript in. For valid options, see Models.
Available options: en, fr, de, es, pt, zh, ja, hi, it, ko, nl, pl, ru, sv, tr, tl, bg, ro, ar, cs, el, fi, hr, ms, sk, da, ta, uk, hu, no, vi, bn, th, he, ka, id, te, gu, kn, ml, mr, pa
Use generation_config.speed for sonic-3.
Speed setting for the model. Defaults to normal.
This feature is experimental and may not work for all voices.
Influences the speed of the generated speech. Faster speeds may reduce hallucination rate.
Available options: slow, normal, fast
Whether to return word-level timestamps. If false (default), no word timestamps will be produced at all. If true, the server will return timestamp events containing word-level timing information.
Whether to return phoneme-level timestamps. If false (default), no phoneme timestamps will be produced. If true, the server will return timestamp events containing phoneme-level timing information.
Whether to use normalized timestamps (true) or original timestamps (false).
The ID of a pronunciation dictionary to use for the generation. Pronunciation dictionaries are supported by sonic-3 models and newer.
Optional context ID for this request.
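The body options described above can be combined roughly as follows. This is a minimal sketch, not the official client: apart from `generation_config`, which this page names, every field name (`model_id`, `transcript`, `language`, `speed`, `add_timestamps`, `add_phoneme_timestamps`, `use_normalized_timestamps`, `pronunciation_dict_id`, `context_id`) is an assumption, and `build_body` is a hypothetical helper. It sends `generation_config` only for sonic-3 models and the legacy speed setting only for earlier ones, following the notes above.

```python
import json
from typing import Optional

# Legacy speed values listed in this reference.
SPEEDS = ("slow", "normal", "fast")


def build_body(
    transcript: str,
    model_id: str,
    language: str = "en",
    speed: str = "normal",
    generation_config: Optional[dict] = None,
    add_timestamps: bool = False,
    add_phoneme_timestamps: bool = False,
    use_normalized_timestamps: Optional[bool] = None,
    pronunciation_dict_id: Optional[str] = None,
    context_id: Optional[str] = None,
) -> dict:
    """Assemble a request body; all field names are assumed, see lead-in."""
    if speed not in SPEEDS:
        raise ValueError(f"speed must be one of {SPEEDS}, got {speed!r}")

    body = {
        "model_id": model_id,
        "transcript": transcript,
        "language": language,
        "add_timestamps": add_timestamps,
        "add_phoneme_timestamps": add_phoneme_timestamps,
    }

    if model_id.startswith("sonic-3"):
        # generation_config only affects sonic-3; the legacy speed is omitted.
        if generation_config is not None:
            body["generation_config"] = generation_config
    else:
        # Earlier models ignore generation_config, so send legacy speed instead.
        body["speed"] = speed

    # Optional fields are included only when explicitly provided.
    if use_normalized_timestamps is not None:
        body["use_normalized_timestamps"] = use_normalized_timestamps
    if pronunciation_dict_id is not None:
        body["pronunciation_dict_id"] = pronunciation_dict_id
    if context_id is not None:
        body["context_id"] = context_id
    return body


if __name__ == "__main__":
    # Print a sample payload for an earlier (non-sonic-3) model.
    print(json.dumps(build_body("Hello there.", "sonic-2", speed="fast"), indent=2))
```

Leaving unset optional fields out of the payload lets the server apply its own defaults rather than hard-coding them on the client side.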