April 2026
Sonic 3.5
Sonic 3.5 is now available on `sonic-3-latest`. We’d love for you to try it and tell us what you think.
Why you should try it
- More natural speech, pacing, and emotional expression, especially noticeable on expressive, conversational, and support-style transcripts.
- Cleaner audio quality across all languages and voices.
- Better alphanumeric read-out — confirmation codes, order numbers, phone numbers, IDs, and emails sound meaningfully more natural, in all supported languages.
- Step-change multilingual performance, particularly Hebrew, Japanese, Spanish, Hindi, German, Korean, and French.
- Tricky English heteronyms like “read,” “bass,” and “bow” are now pronounced correctly in context.
How to try it
- Point your API call or Playground request to the model ID `sonic-3-latest`.
- Keep your existing voice IDs, request shape, and prompting. No code changes are required for most customers.
- Send us feedback on any voice or transcript that behaves differently than you expect.
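As a minimal sketch of the first step, the snippet below builds a TTS request body that targets `sonic-3-latest`. The field names (`model_id`, `transcript`, `voice`), the helper name `build_tts_request`, and the placeholder voice ID are assumptions for illustration and are not taken from this changelog; check the API reference for the actual request shape.

```python
import json

PREVIEW_MODEL = "sonic-3-latest"  # preview alias named in this release note


def build_tts_request(transcript: str, voice_id: str, model_id: str = PREVIEW_MODEL) -> dict:
    """Return a request body in which only the model ID differs from an existing call.

    The field names are assumed for illustration, not confirmed by the changelog.
    """
    return {
        "model_id": model_id,
        "transcript": transcript,
        "voice": {"mode": "id", "id": voice_id},
    }


if __name__ == "__main__":
    # "your-voice-id" is a placeholder; reuse the voice ID you already have.
    body = build_tts_request("Thanks for calling. How can I help?", "your-voice-id")
    print(json.dumps(body, indent=2))
```

Keeping the model ID as a single parameter makes it easy to switch back to another model ID without touching the rest of the request.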
As with any `-latest` alias, `sonic-3-latest` can be updated without notice and is not recommended for production. For production traffic, use the stable `sonic-3` alias or pin to a dated snapshot (e.g. `sonic-3-2026-01-12`).
What to know to be successful
- Spell tags still work the same way. If you already wrap alphanumerics in `<spell>...</spell>`, you don’t need to change anything; you’ll just get better-sounding output. See here for more details.
- If you use custom delimiters (commas/periods between characters or groups) to control pacing, our recommended format has changed. Use spaces between characters and commas between groups, e.g. `A B C, 1 2 3` instead of `A, B, C. 1, 2, 3.`. See Prompting tips for Sonic 3.5 for more details.
- Speed and volume controls are temporarily disabled on `sonic-3-latest`. If you rely on speed or volume augmentation (including via SSML), stay on `sonic-3` for now. We believe that Sonic 3.5 has more natural pacing, and you may find that you don’t need to use speed control as much when using this model.
- Timestamps behave slightly differently. If you use end-of-word timestamps for interruption handling, you should not see a meaningful change. If you depend on beginning-of-word timestamps, please test carefully and reach out if you see regressions for your use case.
- Existing Professional Voice Clones (PVCs) do not carry over to `sonic-3-latest`. Professional Voice Clones are pinned to the base model they were trained on (e.g. `sonic-3`) and will function as a standard voice clone for this model. For more information, see Clone Voices (Pro).
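The delimiter guidance above can be sketched as a small helper. This is an illustration only: `format_for_readout` and `wrap_spell` are hypothetical names, not part of any SDK, and the code simply produces the text formats described in the list.

```python
def format_for_readout(groups: list[str]) -> str:
    """Join characters with spaces and groups with commas, e.g. 'A B C, 1 2 3'."""
    return ", ".join(" ".join(group) for group in groups)


def wrap_spell(text: str) -> str:
    """Wrap an alphanumeric string in spell tags, which work as before."""
    return f"<spell>{text}</spell>"


if __name__ == "__main__":
    print(format_for_readout(["ABC", "123"]))
    print(wrap_spell("ABC123"))
```

Use one approach or the other for a given code: spell tags if you already rely on them, or the space-and-comma format if you control pacing with custom delimiters.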
Where to look for help
March 2026
Breaking
- Text-to-Agent (T2A) API — Text-to-Agent workflow for Line is deprecated.
API
- Error responses — For `Cartesia-Version: 2026-03-01`, we now return structured JSON. See API Errors.
  - API versions before `2026-03-01` continue to return legacy error formats (for example HTTP `Title: Message`).
- Voices — `PATCH /voices/{id}`: voice owners can now update accent and gender. Voice creation validates language. Invalid voice UUIDs and pronunciation-dictionary IDs return 404 instead of ambiguous errors.
- PVC model routing — PVC voices require a dated model ID (e.g. `sonic-3-2026-01-12`) instead of `sonic-3`. See Clone Voices (Pro).
- Voice search — Name and metadata search is diacritics-insensitive.
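Clients that talk to more than one API version may see both error formats. The sketch below normalizes an error body without assuming any particular JSON field names: a JSON object is passed through as-is, and anything else is treated as the legacy `Title: Message` text. The function name `parse_error_body` and the keys of the returned dict are hypothetical.

```python
import json


def parse_error_body(body: str) -> dict:
    """Normalize an error body into a dict, whichever format the server used."""
    try:
        parsed = json.loads(body)
    except json.JSONDecodeError:
        parsed = None

    # Structured errors: keep the JSON object untouched, since its schema
    # is documented elsewhere and not assumed here.
    if isinstance(parsed, dict):
        return {"format": "structured", "error": parsed}

    # Legacy errors: split "Title: Message" on the first separator.
    title, sep, message = body.partition(": ")
    if not sep:
        return {"format": "legacy", "title": None, "message": body.strip()}
    return {"format": "legacy", "title": title.strip(), "message": message.strip()}


if __name__ == "__main__":
    print(parse_error_body("Not Found: voice does not exist"))
    print(parse_error_body('{"example_key": "example value"}'))
```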
Playground
- Pro voice clones
  - Clearer language mismatch messaging
  - Background noise removal is now a simple on/off control
  - Fine-tuning model support:
    - Removed support for older models
    - Now only `sonic-3-2026-01-12` is supported
- Multilingual agents — Multilingual agent configuration is now supported in the Playground.
- Agents UI — Search by call ID and agent ID.
Billing
- Concurrency — Organizations can receive notifications when concurrency nears configured limits.
Model / voice
- Professional Voice Clones — Backend updates improve stability of the professional voice cloning workflow.
- Accents & filters — Additional accent options (e.g. Irish, New Zealand, South African, Belgian) and locale aliases for accent filtering in APIs and Playground.
- Voice Library — 94 new voices across 17 locales (including Arabic, German, English variants, Spanish, Finnish, French, Hebrew, Hindi, Japanese, Korean, Polish, Portuguese, Swedish, Telugu, Thai, and more).
Self-hosted
- On-premises — API for managing voices on self-hosted deployments.
Cartesia SDK
- cartesia-js v3.0.0 (Mar 2) — Major updates:
  - New features: `flush_id` included in chunk and voice changer binary responses; `output_format` and infill support; inline WebSocket response types; byte endpoint returns ArrayBuffer; improved WebPlayer and client export.
  - Fixes: memory leak and timing issues with abort signals/listeners, handling of empty `Content-Length`, and TimeoutError now includes a message.
February 2026
Line
- History Management API: You can add or replace the history provided to your agent, for example, to summarize a long conversation.
- Custom User Events: You can send bidirectional custom events between your client and the agent. You could use this, for example, if you have a web application with UI interactions.
- Uninterruptible Messages: You can set messages as uninterruptible. A common use case is a legal disclaimer at the beginning of a call.
- End Tool Call Improvements: The default end call tool call is more conservative to prevent calls from ending prematurely.
API
- Increased reliability of API connections
Cartesia SDK
- cartesia-python v3.0.0 (Feb 9). See full details in cartesia-python releases.
Playground
- Shipped a new TTS page
- Shipped a new Voice Creation page
- Shipped a new Agents page
Model changes
- Improved pronunciation of real-world text patterns across languages
  - Enhanced support for structured and formatted speech patterns: numbers, dates, times, currency, phone numbers, IDs, percentages, and amounts/measurements.
  - Support for various date formats (YYYY-MM-DD, YYYY/MM/DD, 年月日).
  - Support for measurement units (meters, kg, tablespoon, gigabytes, etc.) with locale awareness.
  - Support for domestic and international phone number formats with locale-specific chunking for French, Italian, German, Portuguese, Korean, and more.
  - Improved alphanumeric ID handling with katakana/hiragana readings and Latin acronym transliteration to katakana for Japanese.
  - Improves all languages except English, Hindi & other Indic languages, Arabic, Hebrew, Chinese, Swedish, Georgian, Bulgarian, and Tagalog (targeted for future updates).
- Support for regional and locale-specific pronunciation within languages
  - Regional voices use region-specific terms in addition to accent (e.g. Belgian and Swiss French “nonante” vs. Canadian and French “quatre-vingt-dix”).
  - Region-specific number terminology, currency symbols, date formats, and measurement units.
  - Locale-aware date and time formatting (e.g. Russian year suffixes, French/Spanish time conventions).
  - Locale-aware currency symbol handling (e.g. $ as “dollars” in en_US and “pesos” in es_MX).
  - Locale pronunciation falls back to the primary country for that language (e.g. US for English, Brazil for Portuguese). We will continue to expand locale-aware support.
  - Improves all languages except English, Hindi & other Indic languages, Arabic, Hebrew, Chinese, Swedish, Georgian, Bulgarian, and Tagalog (targeted for future updates). Existing regional pronunciation for English voices (e.g. British) is unaffected.
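To make the locale fallback rule concrete, here is a toy lookup that mirrors the behavior described above. It is not how the model works internally; the tables contain only the examples given in these notes, and `resolve_locale` and `currency_word` are hypothetical names.

```python
from typing import Optional

# Only the examples from the notes above; illustrative, not exhaustive.
CURRENCY_WORDS = {("$", "en_US"): "dollars", ("$", "es_MX"): "pesos"}
PRIMARY_LOCALE = {"en": "en_US", "pt": "pt_BR"}


def resolve_locale(locale: str, known: set) -> Optional[str]:
    """Return the locale itself if known, else the language's primary locale."""
    if locale in known:
        return locale
    language = locale.split("_")[0]
    fallback = PRIMARY_LOCALE.get(language)
    return fallback if fallback in known else None


def currency_word(symbol: str, locale: str) -> Optional[str]:
    """Look up how a currency symbol is read in a locale, with fallback."""
    known = {loc for (sym, loc) in CURRENCY_WORDS if sym == symbol}
    resolved = resolve_locale(locale, known)
    return CURRENCY_WORDS.get((symbol, resolved)) if resolved else None


if __name__ == "__main__":
    for loc in ("en_US", "es_MX", "en_AU"):
        print(loc, currency_word("$", loc))
```

In this sketch an unlisted English locale such as `en_AU` resolves to `en_US`, which is the "primary country" fallback the notes describe.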
Voice changes
- Voice Library: 39 new voices across 21 locales
Breaking changes effective June 1, 2026
The following model snapshots and languages are discontinued effective June 1, 2026:

| Model | Snapshots | Languages |
|---|---|---|
| sonic | All | All |
| sonic-english | — | All |
| sonic-multilingual | — | All |
| sonic-2 | sonic-2-2025-04-16, sonic-2-2025-05-08, sonic-2-2025-06-11 | it, nl, pl, ru, sv, tr, hi |
| sonic-2 | sonic-2-2025-03-07 | All |
| sonic-turbo | sonic-turbo-2025-06-04 | it, nl, pl, ru, sv, tr |
| sonic-turbo | sonic-turbo-2025-03-07 | All |

| Discontinued Endpoint | Replacement |
|---|---|
| Voice Embedding: POST /voices/clone/clip | Clone Voice |
| Mix Voices: POST /voices/mix | — |
| Create Voice: POST /voices | Clone Voice |

| Endpoint with a breaking change | Replacement |
|---|---|
| TTS (bytes): POST /tts/bytes | Voice ID |
| TTS (SSE): POST /tts/sse | Voice ID |
| TTS (WebSocket): WSS /tts/websocket | Voice ID |
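If you want to audit your own configuration ahead of June 1, 2026, the model table can be encoded as a small lookup. This is a sketch under stated assumptions: `is_discontinued` is a hypothetical helper, and the pattern for the original `sonic` model assumes its snapshots use a `sonic-YYYY-MM-DD` suffix, which this changelog does not confirm.

```python
import re

ALL = None  # sentinel meaning "every language is affected"

_SONIC_2_LANGS = {"it", "nl", "pl", "ru", "sv", "tr", "hi"}
_TURBO_LANGS = {"it", "nl", "pl", "ru", "sv", "tr"}

# Model IDs and languages copied from the discontinuation table.
DISCONTINUED = {
    "sonic-english": ALL,
    "sonic-multilingual": ALL,
    "sonic-2-2025-04-16": _SONIC_2_LANGS,
    "sonic-2-2025-05-08": _SONIC_2_LANGS,
    "sonic-2-2025-06-11": _SONIC_2_LANGS,
    "sonic-2-2025-03-07": ALL,
    "sonic-turbo-2025-06-04": _TURBO_LANGS,
    "sonic-turbo-2025-03-07": ALL,
}

# Assumption: "sonic | All | All" covers the bare ID and date-suffixed snapshots.
_SONIC_V1 = re.compile(r"^sonic(-\d{4}-\d{2}-\d{2})?$")


def is_discontinued(model_id: str, language: str) -> bool:
    """Return True if this model/language pair appears in the table."""
    if _SONIC_V1.match(model_id):
        return True
    if model_id not in DISCONTINUED:
        return False
    languages = DISCONTINUED[model_id]
    return languages is ALL or language in languages


if __name__ == "__main__":
    for model, lang in [("sonic-2-2025-05-08", "hi"), ("sonic-2-2025-05-08", "en"), ("sonic-3", "it")]:
        print(model, lang, is_discontinued(model, lang))
```

The lookup only covers the model IDs listed in the table; aliases that are not listed are reported as not discontinued and should be checked separately.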
January 2026
API
- Regionalization — Calls routed to US, EU, APAC by origin.
- Parameterized outbound calls — Docs
- Pronunciation dictionaries — Docs
Model changes
- Sonic-3 model versioning scheme introduced
  - New preview track: `sonic-3-latest` (continuous updates for early access and feedback).
  - Stable track: `sonic-3` always points to the most recent stable release.
  - Immutable dated snapshots: `sonic-3-YYYY-MM-DD` never change.
  - Details: Continuous updates and model snapshots
- Promotion to stable checkpoint: `sonic-3-2026-01-12`
  - Included improvements: consistent speed & volume, custom IPA pronunciations with stronger adherence, Hindi prosody improvements, Korean prosody/intonation improvements.
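The three tracks of the versioning scheme can be told apart from the model ID alone. The helper below is a hypothetical sketch (the name `classify_model_id` and its return labels are not part of any SDK) that a deployment check might use to warn when a preview alias is configured for production.

```python
import re

# Dated snapshots follow the sonic-3-YYYY-MM-DD pattern described above.
_SNAPSHOT = re.compile(r"^sonic-3-\d{4}-\d{2}-\d{2}$")


def classify_model_id(model_id: str) -> str:
    """Return 'preview', 'stable', 'snapshot', or 'unknown' for a Sonic-3 model ID."""
    if model_id == "sonic-3-latest":
        return "preview"   # continuously updated
    if model_id == "sonic-3":
        return "stable"    # points to the most recent stable release
    if _SNAPSHOT.match(model_id):
        return "snapshot"  # immutable
    return "unknown"


if __name__ == "__main__":
    for mid in ("sonic-3-latest", "sonic-3", "sonic-3-2026-01-12"):
        print(mid, classify_model_id(mid))
```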
Voice changes
- Featured Voices launched — Curated set of 30+ best-performing voices (e.g. Cathy, Henry).
- Voice Library — December: 25 new voices across 6 languages.
- Voice Library — January: 9 Spanish voices (Mexican, Colombian, Castilian).
Playground
- Voice library usability improvements (test with your own scripts, call an agent per voice).
- One-click Report Issue on TTS Playground.
- Mini voice picker (recently used + saved) on TTS page.
- PVC UI + reliability (loading skeletons, error messages, better behavior with large datasets and silence).
Line
- Line SDK v0.2 — Repo. Improved DX, long-running tool-call handling, committed turns, better turn-taking and transcription.