Clone Voice
Clone a voice from an audio clip. This endpoint has two modes: stability and similarity.
Similarity mode produces a clone that sounds closer to the source clip, but it may also reproduce the clip's background noise. For this mode, use an audio clip about 5 seconds long.
Stability mode produces a more stable clone, but it may not sound as similar to the source clip. For this mode, use an audio clip 10-20 seconds long.
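The clip-length guidance above can be checked locally before uploading. The following is a minimal sketch, not part of the API: the function names are hypothetical, it handles WAV input only (via Python's standard `wave` module), and the 3-7 second window used for "about 5 seconds" is an assumption of this example.

```python
import io
import wave

# Recommended clip lengths in seconds, taken from the guidance above.
# ASSUMPTION: "about 5 seconds" is interpreted here as 3-7 seconds.
RECOMMENDED_SECONDS = {
    "stability": (10.0, 20.0),
    "similarity": (3.0, 7.0),
}


def wav_duration_seconds(wav_bytes: bytes) -> float:
    """Return the duration of an in-memory WAV file in seconds."""
    with wave.open(io.BytesIO(wav_bytes), "rb") as wav:
        return wav.getnframes() / wav.getframerate()


def clip_fits_mode(mode: str, duration: float) -> bool:
    """Return True if the duration falls in the recommended range for the mode."""
    if mode not in RECOMMENDED_SECONDS:
        raise ValueError(f"unknown mode: {mode!r}")
    low, high = RECOMMENDED_SECONDS[mode]
    return low <= duration <= high
```

For example, `clip_fits_mode("stability", wav_duration_seconds(data))` returns `False` for a 5-second clip, which suggests either recording a longer clip or switching to similarity mode.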
Headers
Request
The audio clip to clone. For stability mode, the clip should be 10-20 seconds long. For similarity mode, the clip should be about 5 seconds long.
Response
The language in which the given voice should speak the transcript.
Options: English (en), French (fr), German (de), Spanish (es), Portuguese (pt), Chinese (zh), Japanese (ja), Hindi (hi), Italian (it), Korean (ko), Dutch (nl), Polish (pl), Russian (ru), Swedish (sv), Turkish (tr).
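A client might validate a language code against the options listed above before sending a request. This is a minimal sketch under stated assumptions: the codes and names come from the list above, while `SUPPORTED_LANGUAGES` and `normalize_language` are hypothetical names, and the whitespace trimming and lowercasing are conveniences of this example rather than documented API behavior.

```python
# Language codes and names copied from the options list above.
SUPPORTED_LANGUAGES = {
    "en": "English",
    "fr": "French",
    "de": "German",
    "es": "Spanish",
    "pt": "Portuguese",
    "zh": "Chinese",
    "ja": "Japanese",
    "hi": "Hindi",
    "it": "Italian",
    "ko": "Korean",
    "nl": "Dutch",
    "pl": "Polish",
    "ru": "Russian",
    "sv": "Swedish",
    "tr": "Turkish",
}


def normalize_language(code: str) -> str:
    """Return a lowercase, trimmed language code, or raise if it is not listed."""
    normalized = code.strip().lower()
    if normalized not in SUPPORTED_LANGUAGES:
        raise ValueError(f"unsupported language code: {code!r}")
    return normalized
```

Rejecting an unlisted code on the client side gives a clearer error than waiting for the request to fail.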