Changelog 2025 - Cartesia Docs

December 2025

API

sonic-3-latest (preview) and dated sonic-3-YYYY-MM-DD snapshots.
sonic-3-latest added to Playground TTS with banner when selected. See Changelog 2026.
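
Because sonic-3-latest is a preview alias while sonic-3-YYYY-MM-DD names a fixed snapshot, a client may want to check which kind of model ID it has been configured with. The following is a minimal sketch, assuming only the naming pattern stated above; the helper name is hypothetical and not part of any Cartesia SDK.

```python
import re
from datetime import date

# Pattern taken from the changelog entry: sonic-3-YYYY-MM-DD.
_SNAPSHOT_RE = re.compile(r"sonic-3-(\d{4}-\d{2}-\d{2})")


def is_dated_snapshot(model_id: str) -> bool:
    """Return True if model_id looks like a dated sonic-3 snapshot.

    Hypothetical helper: it checks only the name format and that the
    date part is a real calendar date, not that the snapshot exists.
    """
    match = _SNAPSHOT_RE.fullmatch(model_id)
    if match is None:
        return False
    try:
        date.fromisoformat(match.group(1))
    except ValueError:
        return False
    return True
```

A deployment that needs reproducible output could refuse to start when is_dated_snapshot returns False for its configured model ID.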

Voice changes

Voice Library — December: 25 new voices across 6 languages (12 English, 6 Hindi, 4 Arabic, 1 Spanish, 1 Japanese); 14 featured.
Voice library changes; featured voice badge on voice page; /voices/recent endpoint.

Playground

Report generation (report button, alert when user reports).
Voice move; archive and publish voices.
PVC: custom PVC voices UI, multiple user errors surfaced to UI, feature flag for custom model during creation.
Pronunciation dicts: new backend APIs, generator on create/edit, case sensitivity badge.
Agents: new text-to-agent UI, create agent from GitHub repo tarball, system prompt generator for UI agent.
Narrations sunset notice; TTS History pagination; auth strategy for access-tokens.
sonic-3-latest banner and naming.

Other

PVC, STT, and agent improvements.
Error handling and error codes.

November 2025

API

Improved error handling and public error responses; cache invalidation by voice ID.
IPVC train API (remove markAsReady); dataset files overfetch fix; default voice logic fix.

Playground

Pronunciation dicts migrate to new backend APIs; persist visual theme to DB; PVC pipeline error and recommendations.
Call logs conversation view default; TTS textarea height fix; Sonic-3 model for partners shown.
Billing overage “blood bar” and alert fixes; PVC gate for Startup plan.
Pronunciation dict generator on create/edit; API version in dialog; featured voice toggle; narrations model selection.

Line / Agents

No user audio warning (250ms); Pipecat DeepgramNovaVADFilter.
Call recording and artifact storage fixes.

Models / Voices

Sonic 3 PVC and normalizer updates; LoRA and PVC error handling; expand option for dataset file count.
preview_file_url; tags_operator on GET /voices; restrict delete to non-public voices; owner_id check for fine tune voices; user_errors for PVCs.
New Arabic accents; African French and Canadian French.

October 2025

Model changes

Sonic 3 launch (Oct 27) — sonic-3-2025-10-27 stable snapshot released; 42 languages; volume, speed, and emotion controls.
Real-time conversation with emotion and laughter; ~190ms median latency. See Sonic 3 and Volume, Speed, and Emotion.

Other

Continued PVC, STT, and agent improvements; error handling and public errors; manifone voices; Sonic 3 PVC and normalizer updates.
Transcript buffer multilingual and Thai pronunciation dictionary fix; TTFA buffering and reporting; Voice Conversion operator reload; audio norm operator.

September 2025

API

user_id to owner_id in API (model aliasing / ownership).
Improved error handling and version/limit checks.
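
Client code that reads records from before and after the user_id to owner_id rename may need to tolerate both field names. The following is a minimal sketch under that assumption; the record shape and helper name are hypothetical, and the changelog does not say whether the old field is still returned.

```python
from typing import Optional


def owner_of(record: dict) -> Optional[str]:
    """Return the owner of a record, preferring the new field name.

    Assumption: records are plain dicts carrying either "owner_id"
    (new name) or "user_id" (old name). Returns None if neither is set.
    """
    if "owner_id" in record:
        return record["owner_id"]
    return record.get("user_id")
```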

Line / Agents

Warning if no user audio for 250ms+; Pipecat DeepgramNovaVADFilter for spurious on_speech_started.
Call recording and artifact storage fixes.

Models / Voices

STT: Migrate STT providers to Deepgram where appropriate; Deepgram for non-English or language-detect agents; word-level user text chunks.
Sonic 3 / PVC: Sonic 3 PVC updates; Hindi Sonic 3 normalizer revert; LoRA data processing and expand option for dataset file count; PVC errors to webhook.
Manifone new voice; African French and Canadian French accents; partner agents can configure TTS models.

Other

LoRA bugfixes.

August 2025

API

Production-facing agent WebSocket; cancel endpoint for ending live calls.
Improved error handling and public error codes; cache invalidation by voice ID.

Playground

Telephony: stop billing for customer-managed numbers; Cartesia vs Twilio param separation.
Outbound number management columns.

Line / Agents

Deepgram Nova VAD (utterance_end_ms configurable via vad_stop_secs).
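
The entry above pairs a seconds-valued setting (vad_stop_secs) with a milliseconds-valued one (utterance_end_ms). The following is a minimal sketch of the unit conversion only, assuming the mapping is a direct seconds-to-milliseconds conversion; the changelog does not state the exact mapping, and the function name is hypothetical.

```python
def utterance_end_ms_from(vad_stop_secs: float) -> int:
    """Convert a stop duration in seconds to whole milliseconds.

    Assumption: utterance_end_ms is simply vad_stop_secs expressed in
    milliseconds. Any provider-side limits are not modeled here.
    """
    if vad_stop_secs < 0:
        raise ValueError("vad_stop_secs must be non-negative")
    return round(vad_stop_secs * 1000)
```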

Models / Voices

New endpoint for <audio> tags; accent column on voice API; max_buffer_delay applied to continuations; eu-north-1 region.
GET /voices tags_operator; preview_file_url; restrict deleting voices to non-public; check owner_id when listing fine tune voices; user_errors for PVCs from API.
New Arabic accents migration.

Other

Max rollover multiplier for credit plans.

July 2025

API

deploy_error status fix.

Playground

LangChain launched voice agents with Cartesia Sonic TTS.
Billing: Stripe customer for enterprise if needed; call runtime logs in call logs side panel; Call Logs UI nits (from June work).

Line / Agents

Partner pipeline parity with User Agent; concurrency fix (negative concurrency); agent metric LLM credit usage for evals; AgentEvaluations functionality.
User Code Connector WS handlers fix; agent end turn handling; summarization system prompt; user_prompt in API; transcript removed from agent metric result; deadlock fix in WS timeout.

Other

Flushing and concurrency fixes.

June 2025

API

UserCodeAgent deployment URL; cancel endpoint for force-ending live calls via API; Agent EoUD metric; cartesia agent speed-up; user prompt stored separately in agent metrics; agent_evaluations table; async flush for aggregator; User Code Connector WS and last bot turn handling; deployment URL delay on pickup.
Concurrency and WS timeout fixes; improved goroutine handling; agent workers /chats timeout increase.

Playground

Call Logs page for agents with data table and side panel; Agents demo with Twilio web dialer, visualizer, and like/dislike feedback; deployment detail page and list; Twilio number provisioning (Parts 1 & 2); GitConnector redeploy on commit; deployment logs; zip upload for deployment; feature flag by organization; agents gated behind feature flag; Deepgram as default STT for agents; orgs v2 (frontend and backend); 20K credits for organizations; enterprise free trial days and email invoice options.
Credit usage: separate TTS & STT concurrency panels; STT and Infill charts; voice page copyable fields; call runtime logs in call logs panel.

Models / Voices

STT: Whisper large v3; serve multiple models in STT pipeline; word-level user text chunks.
FinetunedSTTContext fixes.

May 2025

API

Voice conversion in enterprise.

Playground

Post-April follow-up to the April 2025 API changes: embeddings are removed; use Voice IDs and Clone Voice instead.

Line / Agents

User code deployments from DB; agent_deployments table; STT cartesia-streaming and Pipecat streaming Whisper; Bedrock proxy for OpenAI-compatible; timestamp bug fixes and default to original timestamps.
Partner /chat and /config updates; DTMF support in UserCodeConnector; endpointing architecture.

Models / Voices

STT: Batch engine utilization; Pipecat streaming Whisper.
Deepgram STT client url/base_url fix.

Other

Voice clone uploads fix.

April 2025

Breaking

sonic-2-2025-04-16: Starting with sonic-2-2025-04-16, we’re removing support for embeddings, the stability cloning mode, and experimental controls for speed and emotion. The similarity cloning mode is dramatically better. To control speed and emotion today, use Instant Voice Cloning (e.g. FFmpeg, Voice Changer, or instant clones from sonic-2-2025-03-07 embeddings). Users who need embeddings or experimental controls can use API version 2024-11-13 with model sonic-2-2025-03-07; both are still available. See Older models.
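
For users staying on the legacy combination described above, the two values to pin are the API version and the model ID. The following is a minimal sketch that only assembles the request pieces and sends nothing. The Cartesia-Version header name, the version 2024-11-13, and the model sonic-2-2025-03-07 come from this changelog; the X-API-Key header and the body field names (model_id, transcript, voice) are assumptions that should be checked against the API reference.

```python
import json

LEGACY_API_VERSION = "2024-11-13"       # from the changelog
LEGACY_MODEL_ID = "sonic-2-2025-03-07"  # from the changelog


def build_legacy_tts_request(api_key: str, transcript: str, voice_id: str):
    """Return (headers, json_body) for a legacy-pinned TTS request.

    Hypothetical helper. Field and header names other than
    Cartesia-Version are assumptions, not confirmed by this page.
    """
    headers = {
        "Cartesia-Version": LEGACY_API_VERSION,
        "X-API-Key": api_key,  # assumed auth header name
        "Content-Type": "application/json",
    }
    body = {
        "model_id": LEGACY_MODEL_ID,             # assumed field name
        "transcript": transcript,                # assumed field name
        "voice": {"mode": "id", "id": voice_id}, # assumed shape
    }
    return headers, json.dumps(body)
```

Keeping both constants in one place makes it harder to upgrade one without the other by accident.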

API

listVoices by ID for single voice; warm-monkey PVC; access tokens (JWT); Cartesia-Version 2024-11-13; phoneme/original timestamps language check; TTS History source; LoRA from fine-tune checkpoints; context expiry replaced by input stream delay.
sonic-2 and sonic-2-2025-04-16 ignore experimental controls on TTS generations; voice cloning supports only similarity clones.
Removed embeddings from all endpoints; voices may only be specified by Voice ID; /tts cannot be called with voice embeddings.
Deprecated /voices/create and /voices/mix.
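
Since voices may now be specified only by Voice ID, callers migrating older code might guard against embedding-style voice specifiers before sending a request. The following is a minimal sketch, assuming a voice specifier is a dict with either an "id" or an "embedding" key; both key names and the helper are hypothetical.

```python
def require_voice_id(voice: dict) -> str:
    """Return the Voice ID from a voice specifier, rejecting embeddings.

    Assumption: legacy specifiers carried an "embedding" key and
    current ones carry a string "id". Key names are illustrative.
    """
    if "embedding" in voice:
        raise ValueError("voice embeddings are no longer accepted; pass a Voice ID")
    voice_id = voice.get("id")
    if not isinstance(voice_id, str) or not voice_id:
        raise ValueError("voice must carry a non-empty string 'id'")
    return voice_id
```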

March 2025

Breaking

sonic-2-2025-03-07 is the last Sonic 2 snapshot supporting voice embeddings and experimental controls. Use with API version 2024-11-13 for legacy behavior.
sonic-preview → JollyTotem, RoseLion deprecated; sonic-2 alias to jolly-totem for speaker switching. See Older models.

API

Cartesia-Version updated to 2024-11-13; model latency via header on bytes endpoint; new Sonic PVC model warm-monkey; listVoices by ID (single voice); access tokens (JWT signing, validation); API-level check for languages supporting phoneme and original timestamps.
Organizations and billing; free credits 10k → 20k; overages product; subscription cache invalidation webhook; TTS History source column (api, playground, narrations); LoRA voices from base VoiceVariation and checkpoint for fine tunes.

Playground

sonic-2 and sonic-turbo aliases launched; Sonic 2 / Sonic Turbo messaging (Turbo = 40ms latency).
cartesia.ai/sonic and playground updates.

Line / Agents

Agent ID in websocket URL; telephony info on partner calls; Pipecat version upgrade; partner demo tool calls; warm-monkey PVC model; prespeak and function call flow updates.
Twilio voice routes support agent IDs; Keypad DTMF on agent; half-duplex STT and LLM context; original timestamps support in API.

Other

sonic-pvc alias and DryVoice as sonic-pvc model. Python SDK announced.

February 2025

API

listVoices by ID; localize endpoint voice name fix; 400s for bad body params; text forcing max transcript length; OpenAI-compatible STT server; agent with local STT; voice tags; on-device transcripts in evals; jolly-totem as default sonic-preview.
S2S and Agents foundational blocks.

Playground

Instant cloning enabled for free users; voice tags; localize refactored to use conditioning; listVoices can query by ID for single voice; Sarah (Similarity) and Southern Woman migrations; on-device transcripts.
Narrations settings (JSONB).

Line / Agents

Agent with local STT; foundational S2S + Agents blocks; design and pipeline work.

Models / Voices

STT: cartesia-streaming and Pipecat streaming Whisper; on-device transcripts.

January 2025

API

sonic-lite added to API; EU deployment for prod API; save option for TTS bytes handler; CORS header for Cartesia-File-ID; Stripe credits default to char_limit in checkout; Redis cache for overage settings; polar-mountain and VC in EU; ListFiles paginator fix.
Eval break/spell tags and replacement/normalization mode.

Models / Voices

sonic-preview routed to MisunderstoodFrog; polar-mountain added and staged; visionary-yogurt timestamp requests for any language.
jolly-totem as default sonic-preview.
