# Changelog 2026

> Product, API, and platform changes for 2026

<Update label="May 2026">
  ### Speech-to-Text

  * **Ink-2, our state-of-the-art streaming STT model** — Build responsive real-time voice experiences with built-in turn detection and accurate transcription even in noisy environments. It currently supports only English, with additional languages coming later.
    * Try it on the [Cartesia Playground](https://play.cartesia.ai/speech-to-text).
    * Integrate via [API](/api-reference/stt/turns/websocket), [Python](https://pypi.org/project/cartesia/), [TypeScript/JavaScript](https://www.npmjs.com/package/@cartesia/cartesia-js), [LiveKit](https://github.com/livekit/agents/blob/99d81a1dd3a026ee3e708dcde74ee8c224fc2a57/examples/other/cartesia.py), and [Pipecat](https://github.com/pipecat-ai/pipecat/blob/e707807d19828213762ff5591f40431ab89fbfc3/examples/voice/voice-cartesia-turns.py).
    * Switching from Deepgram Flux? See the [migration guide](/use-the-api/stt/migrate-from-deepgram-flux).

  ### Text-to-Speech

  * **Sonic 3.5 is now generally available** — Our most natural, expressive TTS model is out of preview and production-ready. Use the `sonic-3.5` alias for the latest stable snapshot. See the [Sonic 3.5 model overview](/build-with-cartesia/tts-models/latest).
    * Switching from Sonic 3? See [Migrating from Sonic 3 to Sonic 3.5](/build-with-cartesia/tts-models/sonic-3-to-sonic-3-5) for what's new and what to check before moving production traffic.
  * **Speed and volume controls** — Dial speed and volume up or down so voices sound the way you want. See the [speed and volume guide](/build-with-cartesia/capability-guides/volume-speed-emotion#speed-and-volume-controls).

  ### Line / Agents

  * **More natural conversations** — Eligible Line agents now run on Sonic 3.5 (TTS) and Ink-2 (STT) by default, improving naturalness, pacing, latency, and turn-taking. No config change needed.
  * **Bring your own Twilio account** — Connect your Twilio account and import your existing phone numbers. You can still use the free Cartesia-provisioned numbers included in your plan. See the [Twilio integration guide](/line/integrations/telephony/twilio/integration).
  * **SIP trunking (Beta)** — Connect your existing phone system directly to Cartesia's voice agents using SIP (Session Initiation Protocol) trunking. Reach out at [support@cartesia.ai](mailto:support@cartesia.ai) for early access.
  * **Phone number and provider APIs** — Provision, import, and configure phone numbers and providers via API. See the [phone numbers API](/api-reference/agents/phone-numbers/import).

  ### Voices

  * **Filter voices by locale** — Find voices with the right accent by passing a locale (e.g. `en-GB`) into the language field when listing voices. The API response now includes a `country` field (e.g. `GB`) to make each voice's regional accent easier to identify. See the [voices API reference](/api-reference/voices/list#response-data-items-country-one-of-0).
  * **57 new voices across 11 locales** — Now in the [voice library](https://play.cartesia.ai/voices), covering `ar-AE`, `de-DE`, `en-CA`, `en-GB`, `en-NZ`, `en-US`, `en-ZA`, `es-MX`, `fr-CA`, `he-IL`, and `th-TH`.
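
As a rough illustration of the locale filter described above, the sketch below builds the query string for a voice listing request and groups a response by its `country` field. The `language` parameter and the `country` field come from the entry above; the sample records, the `name` key, and both helper names are hypothetical and exist only for illustration.

```python
from collections import defaultdict
from urllib.parse import urlencode


def list_voices_query(locale: str) -> str:
    """Build the query string for listing voices filtered by locale."""
    # The locale (e.g. "en-GB") is passed in the `language` field.
    return urlencode({"language": locale})


def group_by_country(voices: list[dict]) -> dict[str, list[str]]:
    """Group voice names by the `country` field of each voice record."""
    grouped: dict[str, list[str]] = defaultdict(list)
    for voice in voices:
        # Records without a `country` value are collected under "unknown".
        grouped[voice.get("country") or "unknown"].append(voice["name"])
    return dict(grouped)


# Hypothetical response items; real records carry more fields.
SAMPLE_VOICES = [
    {"name": "Voice A", "language": "en", "country": "GB"},
    {"name": "Voice B", "language": "en", "country": "US"},
    {"name": "Voice C", "language": "en", "country": "US"},
]

print(list_voices_query("en-GB"))
print(group_by_country(SAMPLE_VOICES))
```
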
</Update>

<Update label="April 2026">
  ### Sonic 3.5

  *Sonic 3.5 is now available on `sonic-3-latest`. We'd love for you to try it and tell us what you think.*

  #### Why you should try it

  * **More natural speech, pacing, and emotional expression**, especially noticeable on expressive, conversational, and support-style transcripts.
  * **Cleaner audio quality** across all languages and voices.
  * **Better alphanumeric read-out** — confirmation codes, order numbers, phone numbers, IDs, and emails sound meaningfully more natural, in all supported languages.
  * **Step-change improvement in multilingual performance**, particularly for Hebrew, Japanese, Spanish, Hindi, German, Korean, and French.
  * **English heteronyms** — Tricky heteronyms like "read," "bass," and "bow" are now pronounced correctly in context.

  #### How to try it

  1. Point your API call or Playground request to the model ID `sonic-3-latest`.
  2. Keep your existing voice IDs, request shape, and prompting — no code changes required for most customers.
  3. Send us feedback on any voice or transcript that behaves differently than you expect.

  <Note>
    As with any `-latest` alias, `sonic-3-latest` can be updated without notice and is not recommended for production. For production traffic, use the stable alias `sonic-3` or pin to a dated snapshot (e.g. `sonic-3-2026-01-12`).
  </Note>

  #### What to know to be successful

  * **Spell tags still work the same way.** If you already wrap alphanumerics in `<spell>...</spell>`, you don't need to change anything — you'll just get better-sounding output. See [Prompting Tips](/build-with-cartesia/capability-guides/prompting-tips) for more details.
  * **If you use custom delimiters** (commas/periods between characters or groups) to control pacing, our recommended format has changed. Use **spaces between characters** and **commas between groups**, e.g. `A B C, 1 2 3` instead of `A, B, C. 1, 2, 3.`. See [Prompting Tips](/build-with-cartesia/capability-guides/prompting-tips) for more details.
  * **Speed and volume controls are temporarily disabled** on `sonic-3-latest`. If you rely on speed or volume adjustment (including via SSML), stay on `sonic-3` for now. We believe Sonic 3.5 has more natural pacing, so you may need speed control less often with this model.
  * **Timestamps behave slightly differently.** If you use end-of-word timestamps for interruption handling, you should not see a meaningful change. If you depend on beginning-of-word timestamps, please test carefully and reach out if you see regressions for your use case.
  * **Existing Professional Voice Clones (PVCs) do not carry over to `sonic-3-latest`.** PVCs are pinned to the base model they were trained on (e.g. `sonic-3`) and function as standard voice clones on this model. For more information, see [Pro Voice Clone](/build-with-cartesia/capability-guides/clone-voices-pro).
  * **Providing proper context to the model improves naturalness.** See the [buffering guide](/use-the-api/tts-websocket/buffering) for details.
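
The recommended delimiter format above (spaces between characters, commas between groups) can be produced with a small helper. This is a minimal sketch: the function name and the default group size of three are assumptions for illustration, not part of any Cartesia SDK.

```python
def format_alphanumeric(code: str, group_size: int = 3) -> str:
    """Space out characters and separate groups with commas.

    Non-alphanumeric characters (dashes, spaces, dots) are dropped so that
    only the delimiters added here control pacing.
    """
    if group_size < 1:
        raise ValueError("group_size must be at least 1")
    chars = [c for c in code if c.isalnum()]
    groups = [chars[i:i + group_size] for i in range(0, len(chars), group_size)]
    return ", ".join(" ".join(group) for group in groups)


print(format_alphanumeric("ABC123"))  # A B C, 1 2 3
```

Pick a group size that matches how the code is normally read aloud, for example 4 for card-style numbers.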

  #### Where to look for help

  * [Sonic 3.5 model overview](/build-with-cartesia/tts-models/latest)
  * [Prompting tips for Sonic 3.5](/build-with-cartesia/capability-guides/prompting-tips)
  * [Model aliases and snapshots](/build-with-cartesia/tts-models/latest#continuous-updates)

  ### API

  * **Usage and API keys** — New HTTP APIs for **usage** and **API keys**.
  * **Speech-to-text (STT)** — Improved documentation. See [STT streaming](/api-reference/stt/websocket).

  ### Playground

  * **Improved call details experience** — **Click on a transcript** to seek audio when reviewing calls.
  * **Cancel call** — You can now **cancel active calls** from the Playground, for example, if you placed an outbound call by mistake.
  * **Keys** — One **Keys** screen with **Standard** and **Admin** tabs when your org has access.
  * **Pronunciation dictionaries** — In-app **list and detail** views for dictionaries tied to your organization.

  ### Line / Agents

  * **LLM provider** — Agent inference now standardizes on **Anthropic**; setup copy and defaults no longer point voice agents at **Gemini** keys.
  * **OpenAI WebSocket mode** — We now support OpenAI's WebSocket mode, which offers **low latency** for agent inference.
  * **Transfer and end call interruption** — In the Line SDK, you can set **transfer** and **end call** as **uninterruptible**.

  ### Models / Voices

  * **Voice Library** — **34** new voices across **10** locales (`ar-001`, `de-DE`, `en-US`, `en-AU`, `he-IL`, `hi-IN`, `ko-KR`, `tl-PH`, `ta-IN`, `te-IN`).
  * **Voice cloning** — More reliable uploads for **M4A** (and similar) source clips when creating clones.

  ### Self-hosted

  * **Playground** — Add voices to your on-prem deployment.
  * **Pronunciation dictionaries** — **`POST /onprem/add-pdict`** to import dictionaries from cloud into self-hosted stacks.
  * **STT** — Optional streaming STT via your **configured provider** integration in self-hosted environments.
</Update>

<Update label="March 2026">
  ### Breaking

  * **Text-to-Agent (T2A) API** — The Text-to-Agent workflow for Line is **deprecated**.

  ### API

  * **Error responses** — For `Cartesia-Version: 2026-03-01`, we now return structured JSON. See [API Errors](/use-the-api/api-errors).
    * API versions before `2026-03-01` continue to return legacy error formats (for example HTTP `Title: Message`).
  * **Voices** — `PATCH /voices/{id}`: voice owners can now update accent and gender. Voice creation validates language. Invalid voice UUIDs and pronunciation-dictionary IDs return 404 instead of ambiguous errors.
  * **PVC model routing** — PVC voices require a dated model ID (e.g. **`sonic-3-2026-01-12`**) instead of **`sonic-3`**. See [Pro Voice Clone](/build-with-cartesia/capability-guides/clone-voices-pro).
  * **Voice search** — Name and metadata search is **diacritics-insensitive**.
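
Clients that talk to several API versions may see both error formats described above. The sketch below shows one way to normalize them. It deliberately assumes nothing about the field names inside the structured JSON body (see the API Errors page for those), and the helper name and sample strings are hypothetical.

```python
import json


def parse_error_body(body: str) -> dict:
    """Normalize an error response body into a dict.

    Structured responses (JSON objects) are returned under "payload".
    Legacy responses of the form "Title: Message" are split on the
    first ": " separator.
    """
    try:
        parsed = json.loads(body)
    except json.JSONDecodeError:
        parsed = None
    if isinstance(parsed, dict):
        return {"format": "structured", "payload": parsed}

    title, sep, message = body.partition(": ")
    if sep:
        return {"format": "legacy", "title": title, "message": message}
    # No separator found: keep the whole body as the message.
    return {"format": "legacy", "title": "", "message": body}


# Hypothetical bodies, for illustration only.
print(parse_error_body("Bad Request: transcript is required"))
print(parse_error_body('{"example_key": "example value"}'))
```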

  ### Playground

  * **Pro voice clones**
    * Clearer **language mismatch** messaging
    * **Background noise removal** is now a simple on/off control
    * **Fine-tuning model support**:
      * Removed support for older models
      * Only **`sonic-3-2026-01-12`** is now supported
  * **Multilingual agents** — Multilingual agent configuration is now supported in the Playground.
  * **Agents UI** — Search by **call ID** and **agent ID**.

  ### Billing

  * **Concurrency** — Organizations can receive **notifications** when concurrency nears configured **limits**.

  ### Model / voice

  * **Professional Voice Clones** — Backend updates improve stability of the professional voice cloning workflow.
  * **Accents & filters** — Additional **accent** options (e.g. **Irish**, **New Zealand**, **South African**, **Belgian**) and **locale aliases** for accent filtering in APIs and Playground.
  * **Voice Library** — **94** new voices across **17** locales (including Arabic, German, English variants, Spanish, Finnish, French, Hebrew, Hindi, Japanese, Korean, Polish, Portuguese, Swedish, Telugu, Thai, and more).

  ### Self-hosted

  * **On-premises** — API for managing voices on self-hosted deployments.

  ### Cartesia SDK

  * **cartesia-js v3.0.0** (Mar 2) — Major updates:

    * New features: `flush_id` included in chunk and voice changer binary responses; `output_format` and infill support; inline WebSocket response types; byte endpoint returns **ArrayBuffer**; improved **WebPlayer** and client export.
    * Fixes: memory leak and timing issues with abort signals/listeners, handling of empty `Content-Length`, and **TimeoutError** now includes a message.

    See [cartesia-js releases](https://github.com/cartesia-ai/cartesia-js/releases) for full details.
</Update>

<Update label="February 2026">
  ### Line

  * **[History Management API](/line/sdk/agents#history-management)**: You can add or replace the history provided to your agent, for example, to summarize a long conversation.
  * **[Custom User Events](/line/sdk/events#custom-event)**: You can send bidirectional custom events between your client and the agent. You could use this, for example, if you have a web application with UI interactions.
  * **[Uninterruptible Messages](/line/sdk/events#speech)**: You can set messages as uninterruptible. A common use case is a legal disclaimer at the beginning of a call.
  * **End Call Tool Improvements**: The default end-call tool is now more conservative, which helps prevent calls from ending prematurely.

  ### API

  * Increased reliability of API connections

  ### Cartesia SDK

  * **cartesia-python v3.0.0** (Feb 9). See full details in [cartesia-python releases](https://github.com/cartesia-ai/cartesia-python/releases).

  ### Playground

  * Shipped a new TTS page
  * Shipped a new Voice Creation page
  * Shipped a new Agents page

  ### Model changes

  * **Improved pronunciation of real-world text patterns across languages**
    * Enhanced support for structured and formatted speech patterns: numbers, dates, times, currency, phone numbers, IDs, percentages, and amounts/measurements.
    * Support for various date formats (YYYY-MM-DD, YYYY/MM/DD, 年月日).
    * Support for measurement units (meters, kg, tablespoon, gigabytes, etc.) with locale awareness.
    * Support for domestic and international phone number formats with locale-specific chunking for French, Italian, German, Portuguese, Korean, and more.
    * Improved alphanumeric ID handling with katakana/hiragana readings and Latin acronym transliteration to katakana for Japanese.
    * Improves all languages except English, Hindi & other Indic languages, Arabic, Hebrew, Chinese, Swedish, Georgian, Bulgarian, and Tagalog (targeted for future updates).
  * **Support for regional and locale-specific pronunciation within languages**
    * Regional voices use region-specific terms in addition to accent (e.g. "nonante" in Belgian and Swiss French vs. "quatre-vingt-dix" in French as spoken in Canada and France).
    * Region-specific number terminology, currency symbols, date formats, and measurement units.
    * Locale-aware date and time formatting (e.g. Russian year suffixes, French/Spanish time conventions).
    * Locale-aware currency symbol handling (e.g. `$` read as "dollars" in `en-US` and "pesos" in `es-MX`).
    * Locale pronunciation falls back to the primary country for that language (e.g. US for English, Brazil for Portuguese). We will continue to expand locale-aware support.
    * Improves all languages except English, Hindi & other Indic languages, Arabic, Hebrew, Chinese, Swedish, Georgian, Bulgarian, and Tagalog (targeted for future updates). Existing regional pronunciation for English voices (e.g. British) is unaffected.

  ### Voice changes

  * **Voice Library**: 39 new voices across 21 locales

  ### Breaking changes effective June 1, 2026

  The following model snapshots and languages are discontinued effective June 1, 2026:

  | Model                | Snapshots                                                        | Languages                  |
  | -------------------- | ---------------------------------------------------------------- | -------------------------- |
  | `sonic`              | All                                                              | All                        |
  | `sonic-english`      | —                                                                | All                        |
  | `sonic-multilingual` | —                                                                | All                        |
  | `sonic-2`            | `sonic-2-2025-04-16`, `sonic-2-2025-05-08`, `sonic-2-2025-06-11` | it, nl, pl, ru, sv, tr, hi |
  |                      | `sonic-2-2025-03-07`                                             | All                        |
  | `sonic-turbo`        | `sonic-turbo-2025-06-04`                                         | it, nl, pl, ru, sv, tr     |
  |                      | `sonic-turbo-2025-03-07`                                         | All                        |
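
To audit a configuration against the table above, a lookup like the following may help. It is a sketch under stated assumptions: the table is transcribed by hand, "All" snapshots of `sonic` is interpreted as the bare `sonic` ID plus any `sonic-YYYY-MM-DD` ID, and the helper name is hypothetical. IDs that the table does not list are reported as not discontinued.

```python
import re

SONIC_2_LANGS = {"it", "nl", "pl", "ru", "sv", "tr", "hi"}
SONIC_TURBO_LANGS = {"it", "nl", "pl", "ru", "sv", "tr"}

# Model ID -> discontinued languages; None means all languages.
DISCONTINUED = {
    "sonic-english": None,
    "sonic-multilingual": None,
    "sonic-2-2025-03-07": None,
    "sonic-2-2025-04-16": SONIC_2_LANGS,
    "sonic-2-2025-05-08": SONIC_2_LANGS,
    "sonic-2-2025-06-11": SONIC_2_LANGS,
    "sonic-turbo-2025-03-07": None,
    "sonic-turbo-2025-06-04": SONIC_TURBO_LANGS,
}

# Assumption: first-generation IDs are "sonic" or "sonic-YYYY-MM-DD".
SONIC_V1 = re.compile(r"^sonic(-\d{4}-\d{2}-\d{2})?$")


def is_discontinued(model_id: str, language: str) -> bool:
    """Return True if this model/language pair is listed as discontinued."""
    if SONIC_V1.match(model_id):
        return True
    if model_id not in DISCONTINUED:
        return False
    languages = DISCONTINUED[model_id]
    return languages is None or language in languages


print(is_discontinued("sonic-2-2025-06-11", "hi"))  # True
print(is_discontinued("sonic-2-2025-06-11", "en"))  # False
```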

  The following endpoints are discontinued effective June 1, 2026:

  | Discontinued Endpoint                      | Replacement                                |
  | ------------------------------------------ | ------------------------------------------ |
  | Voice Embedding: `POST /voices/clone/clip` | [Clone Voice](/api-reference/voices/clone) |
  | Mix Voices: `POST /voices/mix`             | —                                          |
  | Create Voice: `POST /voices`               | [Clone Voice](/api-reference/voices/clone) |

  The following endpoints stop accepting voice embeddings effective June 1, 2026:

  | Endpoint with a breaking change       | Replacement |
  | ------------------------------------- | ----------- |
  | TTS (bytes): `POST /tts/bytes`        | Voice ID    |
  | TTS (SSE): `POST /tts/sse`            | Voice ID    |
  | TTS (WebSocket): `WSS /tts/websocket` | Voice ID    |
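
Before the cutoff, it can be useful to scan stored TTS request payloads for voice embeddings. The sketch below assumes the request carries a `voice` object with a `mode` of `"embedding"` or `"id"`; that payload shape, the helper names, and the placeholder voice ID are assumptions to verify against the API reference.

```python
def uses_voice_embedding(payload: dict) -> bool:
    """Return True if the payload specifies its voice by embedding."""
    voice = payload.get("voice")
    if not isinstance(voice, dict):
        return False
    # Assumed shape: {"mode": "embedding", "embedding": [...]}
    return voice.get("mode") == "embedding" or "embedding" in voice


def with_voice_id(payload: dict, voice_id: str) -> dict:
    """Return a copy of the payload that references a voice by ID."""
    updated = dict(payload)  # shallow copy; the original is left untouched
    # Assumed shape: {"mode": "id", "id": "<voice id>"}
    updated["voice"] = {"mode": "id", "id": voice_id}
    return updated


old_request = {
    "transcript": "Hello",
    "voice": {"mode": "embedding", "embedding": [0.1, 0.2]},
}
new_request = with_voice_id(old_request, "your-voice-id")  # placeholder ID
print(uses_voice_embedding(old_request), uses_voice_embedding(new_request))
```
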
</Update>

<Update label="January 2026">
  ### API

  * **Regionalization** — Calls are now routed to the US, EU, or APAC region based on origin.
  * **Parameterized outbound calls** — [Docs](/line/integrations/telephony/outbound-dialing)
  * **Pronunciation dictionaries** — [Docs](/line/sdk/agents#custom-pronunciations)

  ### Model changes

  * **Sonic 3 model versioning scheme introduced**
    * New preview track: **`sonic-3-latest`** (continuous updates for early access and feedback).
    * Stable track: **`sonic-3`** always points to the most recent stable release.
    * Immutable dated snapshots: **`sonic-3-YYYY-MM-DD`** never change.
    * Details: [Continuous updates and model snapshots](/build-with-cartesia/tts-models/latest#continuous-updates)
  * **Promotion to stable checkpoint:** **`sonic-3-2026-01-12`**
    * Included improvements: consistent speed & volume, custom IPA pronunciations with stronger adherence, Hindi prosody improvements, Korean prosody/intonation improvements.
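
The three tracks above can be told apart from the model ID alone. The following is a minimal sketch of such a check, useful for example as a deploy-time guard; the function names and the three track labels are hypothetical, and any ID that is neither a `-latest` alias nor a dated snapshot is assumed to be a stable alias.

```python
import re

# Matches IDs that end in a date, e.g. "sonic-3-2026-01-12".
DATED_SNAPSHOT = re.compile(r"^.+-\d{4}-\d{2}-\d{2}$")


def classify_model_id(model_id: str) -> str:
    """Classify a model ID as 'preview', 'snapshot', or 'stable'."""
    if model_id.endswith("-latest"):
        return "preview"   # continuously updated, may change without notice
    if DATED_SNAPSHOT.match(model_id):
        return "snapshot"  # immutable dated snapshot
    return "stable"        # alias that tracks the latest stable release


def is_pinned(model_id: str) -> bool:
    """True only for immutable dated snapshots."""
    return classify_model_id(model_id) == "snapshot"


for model in ("sonic-3-latest", "sonic-3", "sonic-3-2026-01-12"):
    print(model, "->", classify_model_id(model))
```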

  ### Voice changes

  * **Featured Voices launched** — Curated set of 30+ best-performing voices (e.g. [Cathy](https://play.cartesia.ai/voices/e8e5fffb-252c-436d-b842-8879b84445b6), [Henry](https://play.cartesia.ai/voices/87286a8d-7ea7-4235-a41a-dd9fa6630feb)).
  * **Voice Library** — December: 25 new voices across 6 languages.
  * **Voice Library** — January: 9 Spanish voices (Mexican, Colombian, Castilian).

  ### Playground

  * Voice library usability improvements (test with your own scripts, call an agent per voice).
  * One-click **Report Issue** on TTS Playground.
  * Mini voice picker (recently used + saved) on TTS page.
  * PVC UI + reliability (loading skeletons, error messages, better behavior with large datasets and silence).

  ### Line

  * **Line SDK v0.2** — [Repo](https://github.com/cartesia-ai/line). Improved DX, long-running tool-call handling, **committed turns**, better turn-taking and transcription.
</Update>