Clone Voice - Cartesia Docs

curl --request POST \
  --url https://api.cartesia.ai/voices/clone \
  --header 'Cartesia-Version: <cartesia-version>' \
  --header 'Content-Type: multipart/form-data' \
  --header 'X-API-Key: <api-key>' \
  --form clip='@example-file' \
  --form 'name=<string>' \
  --form 'language=<string>' \
  --form 'description=<string>' \
  --form 'base_voice_id=<string>' \
  --form enhance=true

Authorizations

X-API-Key

string

header

required

Headers

Cartesia-Version

enum<string>

required

API version header.

Available options:

2024-11-13

Example:

"2024-11-13"

Body

multipart/form-data

clip

file

required

See Clone Voices for guidance on choosing a clip.

Supported audio formats: flac, mp3, mpeg, mpga, oga, ogg, wav, webm
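A quick client-side check can catch an unsupported clip before it is uploaded. The sketch below tests only the file extension against the formats listed above. The names `SUPPORTED_CLIP_FORMATS` and `is_supported_clip` are illustrative and not part of the Cartesia API, and how the service itself detects the audio format is not stated in this reference.

```python
from pathlib import Path

# Formats listed in this reference for the `clip` field.
SUPPORTED_CLIP_FORMATS = {"flac", "mp3", "mpeg", "mpga", "oga", "ogg", "wav", "webm"}


def is_supported_clip(path: str) -> bool:
    """Return True if the file extension is one of the listed formats.

    Only the extension is inspected; the audio data itself is not verified.
    """
    suffix = Path(path).suffix.lower().lstrip(".")
    return suffix in SUPPORTED_CLIP_FORMATS
```

For example, `is_supported_clip("sample.wav")` is true, while a file with an `.aac` extension or no extension at all is rejected.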

name

string

required

The name of the voice.

language

enum<string>

required

The language of the voice.

Available options:

en, fr, de, es, pt, zh, ja, hi, it, ko, nl, pl, ru, sv, tr
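Because `language` is a required enum, it can be worth validating the value before sending the request. This is a minimal sketch using the options listed above. `normalize_language` is a hypothetical helper, and lowercasing the input is a client-side convenience, not documented API behavior.

```python
# Language codes listed as available options for the `language` field.
SUPPORTED_LANGUAGES = (
    "en", "fr", "de", "es", "pt", "zh", "ja", "hi",
    "it", "ko", "nl", "pl", "ru", "sv", "tr",
)


def normalize_language(code: str) -> str:
    """Return a trimmed, lowercased language code, or raise ValueError if unlisted."""
    normalized = code.strip().lower()
    if normalized not in SUPPORTED_LANGUAGES:
        raise ValueError(f"unsupported language code: {code!r}")
    return normalized
```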

description

string | null

A description for the voice.

base_voice_id

string | null

Optional base voice ID that the cloned voice is derived from.

mode

enum<string>

deprecated

No longer used.

Available options:

similarity, stability

enhance

boolean | null

deprecated

Whether to apply AI enhancements to the clip to reduce background noise. This leads to cleaner generated speech at the cost of reduced similarity to the source clip.
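The headers and body fields above can be assembled into a single request. The sketch below builds, but does not send, the multipart/form-data POST using only the Python standard library. `build_clone_request` and its parameter names are illustrative, the `application/octet-stream` part type for the clip is an assumption, and the deprecated `mode` and `enhance` fields are deliberately left out.

```python
import urllib.request
import uuid

CLONE_URL = "https://api.cartesia.ai/voices/clone"


def build_clone_request(api_key, clip_name, clip_bytes, name, language,
                        description=None, base_voice_id=None,
                        version="2024-11-13"):
    """Build (without sending) a multipart/form-data POST for the clone endpoint."""
    boundary = uuid.uuid4().hex

    # Required text fields, plus optional ones only when provided.
    fields = {"name": name, "language": language}
    if description is not None:
        fields["description"] = description
    if base_voice_id is not None:
        fields["base_voice_id"] = base_voice_id

    parts = []
    for key, value in fields.items():
        parts.append(
            (f"--{boundary}\r\n"
             f'Content-Disposition: form-data; name="{key}"\r\n\r\n'
             f"{value}\r\n").encode("utf-8")
        )

    # File part for the audio clip. The part Content-Type is an assumption;
    # clip_name is assumed to contain no double quotes or line breaks.
    parts.append(
        (f"--{boundary}\r\n"
         f'Content-Disposition: form-data; name="clip"; filename="{clip_name}"\r\n'
         "Content-Type: application/octet-stream\r\n\r\n").encode("utf-8")
        + clip_bytes + b"\r\n"
    )
    parts.append(f"--{boundary}--\r\n".encode("utf-8"))

    headers = {
        "X-API-Key": api_key,
        "Cartesia-Version": version,
        "Content-Type": f"multipart/form-data; boundary={boundary}",
    }
    return urllib.request.Request(
        CLONE_URL, data=b"".join(parts), headers=headers, method="POST"
    )
```

The returned object can be passed to `urllib.request.urlopen` to perform the call. Building the request separately makes it possible to inspect the headers and body first.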

Response

200 - application/json

id

string

required

The ID of the voice.

user_id

string

required

The ID of the user who owns the voice.

is_public

boolean

required

Whether the voice is publicly accessible.

name

string

required

The name of the voice.

description

string

required

The description of the voice.

created_at

string<date-time>

required

The date and time the voice was created.

language

string

required

The voice's language, as an ISO 639-1 code (e.g. en, fr, zh).

Example:

"en"
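The 200 response can be read into a typed structure. This is a sketch under two assumptions: that the voice ID is returned under the key `id`, and that `created_at` is an ISO 8601 timestamp. `ClonedVoice` and `parse_cloned_voice` are hypothetical names, not part of any Cartesia SDK.

```python
import json
from dataclasses import dataclass
from datetime import datetime


@dataclass
class ClonedVoice:
    id: str
    user_id: str
    is_public: bool
    name: str
    description: str
    created_at: datetime
    language: str


def parse_cloned_voice(payload: str) -> ClonedVoice:
    """Parse the JSON response body into a ClonedVoice; raises KeyError if a field is missing."""
    data = json.loads(payload)
    # Older Python versions reject a trailing "Z" in fromisoformat, so map it to an explicit UTC offset.
    created = datetime.fromisoformat(data["created_at"].replace("Z", "+00:00"))
    return ClonedVoice(
        id=data["id"],
        user_id=data["user_id"],
        is_public=data["is_public"],
        name=data["name"],
        description=data["description"],
        created_at=created,
        language=data["language"],
    )
```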