Sonic 3.5 is currently in preview on the sonic-3-latest alias. As with any -latest alias, sonic-3-latest can be updated without notice and is not recommended for production. Pin to a dated sonic-3 snapshot for production traffic.
Sonic 3.5 is designed to sound natural with minimal prompt engineering. In most cases you can pass your transcript as-is and let the model handle normalization, pacing, and expression.

Start here

  • Pass natural, well-punctuated text. Full sentences with normal capitalization and punctuation produce the best pacing and intonation. Ending each transcript with terminal punctuation (., ?, !) is recommended.
  • Let the model normalize numbers, dates, times, and alphanumerics unless you have a specific reason to override it. Sonic 3.5 handles the most common patterns natively:
    • Bare strings: 94598, ABC123
    • US phone numbers: (415) 555-1212
    • Email addresses: user@example.com
    • Dates in MM/DD/YYYY form, or DD/MM/YYYY depending on locale (e.g. 04/20/2025)
    • Times with a space before AM/PM: 7:00 PM, 7 PM, 7:00 P.M.
Symbols in these patterns are handled naturally: the @ sign in an email address is read as "at", and the parentheses in a phone number are silent.
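The terminal-punctuation recommendation above can be applied as a small preprocessing step before a transcript is sent. The following is a minimal sketch, not part of any SDK: the function name `ensure_terminal_punctuation` and the choice of a period as the default mark are assumptions made for illustration.

```python
# Hypothetical helper (not part of any SDK): makes sure a transcript ends with
# one of the terminal marks recommended above (., ?, !).
TERMINAL_PUNCTUATION = (".", "?", "!")


def ensure_terminal_punctuation(transcript: str, default: str = ".") -> str:
    """Strip trailing whitespace and append `default` if no terminal mark is present."""
    text = transcript.rstrip()
    if not text:
        return text
    # Look past closing quotes or brackets so 'She said "hi."' is left unchanged.
    core = text.rstrip("\"')]\u201d\u2019")
    if core.endswith(TERMINAL_PUNCTUATION):
        return text
    return text + default


print(ensure_terminal_punctuation("Your order has shipped"))
print(ensure_terminal_punctuation("Is that right?  "))
```

Everything else in the transcript is left untouched, so the model still handles normalization of numbers, dates, and times itself.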

Controlling pacing and spelling

When you need a string read out character by character (confirmation codes, order IDs, serial numbers, spelled-out names), use one of the following:
  1. Spell tags (recommended). Wrap the string in <spell>...</spell>. This is the most reliable option and works for letters, digits, and mixed alphanumerics in all supported languages.
    • Example: Your confirmation code is <spell>AB12CD</spell>.
  2. Space-delimited characters. If you prefer not to use tags, separate characters with single spaces and use commas between logical groups where a human would pause.
    • Example: Your code is A B C, 1 2 3.
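Both options above are easy to generate programmatically. The sketch below is illustrative only: `spell_tag` and `space_delimited` are hypothetical helper names, and the fixed `group_size` is an assumption standing in for "logical groups where a human would pause", which in practice depends on the kind of code being read.

```python
# Hypothetical helpers for the two spelling options described above.

def spell_tag(value: str) -> str:
    """Option 1: wrap the string in <spell> tags."""
    return f"<spell>{value}</spell>"


def space_delimited(value: str, group_size: int = 3) -> str:
    """Option 2: single spaces between characters, commas between groups.

    `group_size` is an assumed, fixed grouping; pick whatever grouping a
    human reader would naturally pause on.
    """
    if group_size < 1:
        raise ValueError("group_size must be at least 1")
    chars = [c for c in value if not c.isspace()]
    groups = [chars[i:i + group_size] for i in range(0, len(chars), group_size)]
    return ", ".join(" ".join(group) for group in groups)


print(f"Your confirmation code is {spell_tag('AB12CD')}.")
print(f"Your code is {space_delimited('ABC123')}.")
```

With the default grouping, the second call reproduces the space-delimited example shown in the list.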
If you're migrating from Sonic 3 or an older model, note that the recommended delimiter format has changed: use spaces between characters and commas between groups, instead of commas between characters and periods between groups. The old format may still work but is no longer recommended.
| Scenario | Old approach (comma between characters, period between groups) | New approach (space between characters, comma between groups) |
| --- | --- | --- |
| Spell out letters HELLO | H, E, L, L, O | H E L L O |
| Spell out digits 123456 | 1, 2, 3, 4, 5, 6 | 1 2 3 4 5 6 |
| Confirmation code ABC123 | A, B, C. 1, 2, 3. | A B C, 1 2 3 |
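If existing transcripts already contain spans in the old delimiter format, the conversion shown in the table can be automated. This is a minimal sketch under the assumption that the input is only the spelled-out span (not a whole sentence, where periods and commas carry their normal meaning); `migrate_delimiters` is a hypothetical name.

```python
# Hypothetical converter from the old delimiter format (commas between
# characters, periods between groups) to the new one (spaces between
# characters, commas between groups). Pass only the spelled-out span.

def migrate_delimiters(old: str) -> str:
    # Periods separated groups in the old format; drop the empty piece
    # left behind by a trailing period.
    groups = [g.strip() for g in old.split(".") if g.strip()]
    # Commas separated characters within a group; they become single spaces.
    new_groups = [
        " ".join(ch.strip() for ch in group.split(",") if ch.strip())
        for group in groups
    ]
    return ", ".join(new_groups)


print(migrate_delimiters("A, B, C. 1, 2, 3."))
```

Running a whole sentence through this function would mangle its ordinary punctuation, so the span to convert needs to be isolated first.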

Inserting pauses

Sonic 3.5 respects natural punctuation such as commas and periods. If you need a longer pause, or a pause at a specific point mid-transcript, use a break tag. Break tags count as a single character and do not need surrounding whitespace. Dashes are no longer recommended for inserting pauses.

Pronunciation

Proper nouns, trademarks, and domain-specific terms can be specified via custom pronunciations. This is also the right tool for disambiguating identical spellings with different pronunciations (e.g. Nice, the city, vs. nice, the adjective).

Multilingual

Match the voice to the language. Each voice has a primary language it works best with. Use the Playground to audition voices for a given language.

Streaming

Use continuations when generating chunks of audio that need to sound contiguous (for example, LLM-streamed output). This preserves prosody and voice consistency across chunk boundaries.

Speed and volume

Speed and volume controls (including SSML-based controls) are temporarily disabled on sonic-3-latest while Sonic 3.5 is in preview. If you need to adjust speed or volume today, use a dated sonic-3 snapshot.