Create Voice

POST /voices

curl --request POST \
  --url https://api.cartesia.ai/voices \
  --header 'Cartesia-Version: <cartesia-version>' \
  --header 'Content-Type: application/json' \
  --header 'X-API-Key: <api-key>' \
  --data '
{
  "name": "<string>",
  "description": "<string>",
  "embedding": [
    123
  ],
  "language": "en",
  "base_voice_id": "<string>"
}
'

Example response (200 - application/json):
{
  "id": "<string>",
  "is_owner": true,
  "name": "<string>",
  "description": "<string>",
  "created_at": "2023-11-07T05:31:56Z",
  "language": "en",
  "embedding": [
    123
  ],
  "preview_file_url": "<string>"
}

This endpoint is deprecated. Voices created with it will stop working after June 1, 2026. Migrate to the Clone Voice endpoint instead, and contact support@cartesia.ai if you have any questions.

Authorizations

X-API-Key
string
header
required

Headers

Cartesia-Version
enum<string>
required

API version header.

Available options: 2024-06-10, 2024-11-13, 2025-04-16, 2026-03-01

Example: "2024-11-13"
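
The three headers shown in the curl example can be assembled and checked before any request is sent. The following is a minimal sketch, not part of any Cartesia SDK: `build_headers` and `CARTESIA_VERSIONS` are hypothetical names introduced here, and the default version is simply the example value listed above.

```python
# Version strings copied from the "Available options" list for Cartesia-Version.
CARTESIA_VERSIONS = ("2024-06-10", "2024-11-13", "2025-04-16", "2026-03-01")


def build_headers(api_key: str, version: str = "2024-11-13") -> dict:
    """Return request headers for a JSON call (hypothetical helper)."""
    if not api_key:
        raise ValueError("api_key must be a non-empty string")
    if version not in CARTESIA_VERSIONS:
        raise ValueError(f"unsupported Cartesia-Version: {version!r}")
    return {
        "Cartesia-Version": version,
        "Content-Type": "application/json",
        "X-API-Key": api_key,
    }
```

Rejecting an unknown version locally is a design choice for this sketch; the server remains the authority on which versions it accepts.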

Body

application/json
name
string
required

The name of the voice.

description
string
required

The description of the voice.

embedding
number<double>[]
required

A 192-dimensional vector (i.e. a list of 192 numbers) that represents the voice.

language
enum<string> | null

The language that the given voice should speak the transcript in.

Options: English (en), French (fr), German (de), Spanish (es), Portuguese (pt), Chinese (zh), Japanese (ja), Hindi (hi), Italian (it), Korean (ko), Dutch (nl), Polish (pl), Russian (ru), Swedish (sv), Turkish (tr).

Available options: en, fr, de, es, pt, zh, ja, hi, it, ko, nl, pl, ru, sv, tr

base_voice_id
string | null

The ID of a base voice to pull features from, used for features such as voice mixing.
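
The body constraints described above (three required fields, a 192-number embedding, and an optional language code from the listed set) can be checked client-side before the request is sent. This is a sketch under stated assumptions: `build_create_voice_body`, `EMBEDDING_DIM`, and `LANGUAGES` are hypothetical names, and leaving out unset optional fields instead of sending `null` is a choice made here, not a documented requirement.

```python
import json
from typing import Optional, Sequence

EMBEDDING_DIM = 192  # documented size of the embedding vector
# Language codes copied from the "Available options" list for `language`.
LANGUAGES = {"en", "fr", "de", "es", "pt", "zh", "ja", "hi",
             "it", "ko", "nl", "pl", "ru", "sv", "tr"}


def build_create_voice_body(
    name: str,
    description: str,
    embedding: Sequence[float],
    language: Optional[str] = None,
    base_voice_id: Optional[str] = None,
) -> str:
    """Return the JSON request body as a string (hypothetical helper)."""
    if len(embedding) != EMBEDDING_DIM:
        raise ValueError(
            f"embedding must have {EMBEDDING_DIM} numbers, got {len(embedding)}"
        )
    if language is not None and language not in LANGUAGES:
        raise ValueError(f"unsupported language: {language!r}")
    body = {
        "name": name,
        "description": description,
        "embedding": [float(x) for x in embedding],
    }
    # Optional fields are omitted when unset (an assumption of this sketch).
    if language is not None:
        body["language"] = language
    if base_voice_id is not None:
        body["base_voice_id"] = base_voice_id
    return json.dumps(body)
```

The returned string can be passed as the `--data` argument of the curl example, or as the request body in any HTTP client.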

Response

200 - application/json
id
string
required

The ID of the voice.

is_owner
boolean
required

Whether your organization owns the voice.

name
string
required

The name of the voice.

description
string
required

The description of the voice.

created_at
string<date-time>
required

The date and time the voice was created.

language
enum<string>
required

The language that the given voice should speak the transcript in.

Options: English (en), French (fr), German (de), Spanish (es), Portuguese (pt), Chinese (zh), Japanese (ja), Hindi (hi), Italian (it), Korean (ko), Dutch (nl), Polish (pl), Russian (ru), Swedish (sv), Turkish (tr).

Available options: en, fr, de, es, pt, zh, ja, hi, it, ko, nl, pl, ru, sv, tr

embedding
number<double>[] | null

The vector embedding of the voice.

preview_file_url
string | null

A URL to download a preview audio file of this voice, useful for sampling the voice without consuming credits. The URL requires the same Authorization header. Voice previews may be changed, moved, or deleted, so avoid storing the URL permanently. This property is null if no preview is available. It is only included when expand[] includes preview_file_url.
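
The response fields above map naturally onto a typed record. The sketch below parses a 200 response body into a dataclass, treating `embedding` and `preview_file_url` as optional because both may be null or absent. `Voice` and `parse_voice` are hypothetical names introduced for illustration; they are not part of an official client.

```python
import json
from dataclasses import dataclass
from datetime import datetime
from typing import List, Optional


@dataclass
class Voice:
    id: str
    is_owner: bool
    name: str
    description: str
    created_at: datetime
    language: str
    embedding: Optional[List[float]] = None
    preview_file_url: Optional[str] = None


def parse_voice(raw: str) -> Voice:
    """Parse a Create Voice response body (hypothetical helper)."""
    data = json.loads(raw)
    # Replace a trailing "Z" so fromisoformat also works before Python 3.11.
    created = datetime.fromisoformat(data["created_at"].replace("Z", "+00:00"))
    return Voice(
        id=data["id"],
        is_owner=data["is_owner"],
        name=data["name"],
        description=data["description"],
        created_at=created,
        language=data["language"],
        embedding=data.get("embedding"),  # may be null or missing
        preview_file_url=data.get("preview_file_url"),  # may be null or missing
    )
```

Because preview URLs may be changed, moved, or deleted, a caller would typically read `preview_file_url` from a fresh response each time instead of persisting it.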