All models in the Sonic TTS family support custom pronunciations in your transcripts. Try out the pronunciation tool on our demo page.
sonic-3.5 and sonic-3 support custom pronunciation dictionaries, which let you specify how to pronounce a specific word or phrase.
A dictionary is a simple search and replace: it directs the model to use another string in place of the matching text in the transcript. The pronunciation can be either an IPA pronunciation or a “sounds-like” guidance:
[
  {
    "text": "bayou",
    "pronunciation": "<<ˈ|b|ɑ|ˈ|j|u>>"
  },
  {
    "text": "jambalaya",
    "pronunciation": "<<ˈ|dʒ|ə|m|ˈ|b|ə|ˈ|l|aɪ|ˈ|ə>>"
  },
  {
    "text": "tchoupitoulas",
    "pronunciation": "chop-uh-TOO-liss"
  }
]
The legacy alias field is deprecated. Use pronunciation for new dictionary items.
Save these JSONs as pronunciation dictionaries through our API or through our playground.
Once a dictionary is created, use it in any TTS API by passing its id as pronunciation_dict_id.
With the dictionary above, the string "I ate some jambalaya on tchoupitoulas street" becomes "I ate some <<ˈ|dʒ|ə|m|ˈ|b|ə|ˈ|l|aɪ|ˈ|ə>> on chop-uh-TOO-liss street" before being handed off to the model.
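To make the search-and-replace behavior concrete, here is a minimal local sketch in Python. It is not the service's implementation: the function name apply_dictionary is hypothetical, and it assumes plain substring replacement applied entry by entry. How the real service handles word boundaries, overlapping entries, or ordering is not described here.

```python
# Hypothetical local approximation of dictionary substitution.
# Assumption: entries are applied in order with plain substring replacement;
# the actual service may treat word boundaries and overlaps differently.
entries = [
    {"text": "jambalaya", "pronunciation": "<<ˈ|dʒ|ə|m|ˈ|b|ə|ˈ|l|aɪ|ˈ|ə>>"},
    {"text": "tchoupitoulas", "pronunciation": "chop-uh-TOO-liss"},
]


def apply_dictionary(transcript, entries):
    """Replace each entry's text with its pronunciation string."""
    for entry in entries:
        transcript = transcript.replace(entry["text"], entry["pronunciation"])
    return transcript


result = apply_dictionary("I ate some jambalaya on tchoupitoulas street", entries)
print(result)
```

Running a transcript through a sketch like this before sending it can help check that your entries hit the words you expect.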

Case Sensitivity

Dictionary matching is case-sensitive, with one exception: a lowercase entry also matches its sentence-start capitalized form. For example, cat matches both cat and Cat, but not CAT. An entry for CAT only matches CAT.
This applies to multi-word entries too. An entry for green valley matches green valley and Green valley, but not Green Valley.
Use lowercase entries for common words. These match the word both mid-sentence (cat) and at the start of a sentence (Cat), covering the two most common positions.
Use exact capitalization for proper nouns. A term like LaTeX should be entered as LaTeX so it doesn’t collide with a different pronunciation for the common word latex. For multi-word proper nouns, enter the exact casing as it appears in your transcripts, for example Green Valley if the transcript capitalizes both words.
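The matching rule above can be sketched as a small Python helper that lists the spellings an entry would match. The function name matching_forms is hypothetical and this is only a local illustration of the stated rule. It ignores where in the sentence the capitalized form appears, since the rule above does not say whether position is checked.

```python
def matching_forms(entry_text):
    """Return the transcript spellings an entry matches under the stated rule."""
    forms = {entry_text}
    # Only an all-lowercase entry gets the extra capitalized form, and only
    # the first character of the whole entry is capitalized.
    if entry_text == entry_text.lower():
        forms.add(entry_text[:1].upper() + entry_text[1:])
    return forms


for entry in ["cat", "CAT", "green valley", "LaTeX"]:
    print(entry, "matches", sorted(matching_forms(entry)))
```

A check like this can flag transcripts whose casing (such as Green Valley) would need its own exact-case entry.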