December 2025
API
- sonic-3-latest (preview) and dated sonic-3-YYYY-MM-DD snapshots.
- sonic-3-latest added to Playground TTS with banner when selected. See Changelog 2026.
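A minimal sketch of telling a moving alias such as sonic-3-latest apart from a dated snapshot such as sonic-3-2025-10-27. It assumes only the `<family>-YYYY-MM-DD` naming pattern seen in this changelog; the helper names are hypothetical and not part of any Cartesia SDK.

```python
import re
from datetime import date
from typing import Optional

# Assumption: dated snapshots end in "-YYYY-MM-DD"; anything else is treated as an alias.
SNAPSHOT_RE = re.compile(r"^(?P<family>[a-z0-9-]+?)-(?P<date>\d{4}-\d{2}-\d{2})$")


def snapshot_date(model_id: str) -> Optional[date]:
    """Return the snapshot date encoded in a model ID, or None for aliases."""
    match = SNAPSHOT_RE.match(model_id)
    if match is None:
        return None
    try:
        return date.fromisoformat(match.group("date"))
    except ValueError:
        # Looks like a date suffix but is not a valid calendar date.
        return None


def is_pinned(model_id: str) -> bool:
    """True when the model ID names a dated snapshot rather than a moving alias."""
    return snapshot_date(model_id) is not None
```

A check like this can help keep production configuration on dated snapshots while leaving the preview alias for experiments.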
Voice changes
- Voice Library — December: 25 new voices across 6 languages (12 English, 6 Hindi, 4 Arabic, 1 Spanish, 1 Japanese); 14 featured.
- Voice library changes; featured voice badge on voice page; /voices/recent endpoint.
Playground
- Report generation (report button, alert when user reports).
- Voice move; archive and publish voices.
- PVC: custom PVC voices UI, multiple user errors surfaced to UI, feature flag for custom model during creation.
- Pronunciation dicts: new backend APIs, generator on create/edit, case sensitivity badge.
- Agents: new text-to-agent UI, create agent from GitHub repo tarball, system prompt generator for UI agent.
- Narrations sunset notice; TTS History pagination; auth strategy for access-tokens.
- sonic-3-latest banner and naming.
Other
- PVC, STT, and agent improvements.
- Error handling and error codes.
November 2025
API
- Improved error handling and public error responses; cache invalidation by voice ID.
- IPVC train API (remove markAsReady); dataset files overfetch fix; default voice logic fix.
Playground
- Pronunciation dicts migrate to new backend APIs; persist visual theme to DB; PVC pipeline error and recommendations.
- Call logs conversation view default; TTS textarea height fix; Sonic-3 model for partners shown.
- Billing overage “blood bar” and alert fixes; PVC gate for Startup plan.
- Pronunciation dict generator on create/edit; API version in dialog; featured voice toggle; narrations model selection.
Line / Agents
- No user audio warning (250ms); Pipecat DeepgramNovaVADFilter.
- Call recording and artifact storage fixes.
Models / Voices
- Sonic 3 PVC and normalizer updates; LoRA and PVC error handling; expand option for dataset file count.
- preview_file_url; tags_operator on GET /voices; restrict delete to non-public voices; owner_id check for fine tune voices; user_errors for PVCs.
- New Arabic accents; African French and Canadian French.
October 2025
Model changes
- Sonic 3 launch (Oct 27) — sonic-3-2025-10-27 stable snapshot released; 42 languages; volume, speed, and emotion controls.
- Real-time conversation with emotion and laughter; ~190ms median latency. See Sonic 3 and Volume, Speed, and Emotion.
Other
- Continued PVC, STT, and agent improvements; error handling and public errors; manifone voices; Sonic 3 PVC and normalizer updates.
- Transcript buffer multilingual and Thai pronunciation dictionary fix; TTFA buffering and reporting; Voice Conversion operator reload; audio norm operator.
September 2025
API
- user_id to owner_id in API (model aliasing / ownership).
- Improved error handling and version/limit checks.
Line / Agents
- Warning if no user audio for 250ms+; Pipecat DeepgramNovaVADFilter for spurious on_speech_started.
- Call recording and artifact storage fixes.
Models / Voices
- STT: Migrate STT providers to Deepgram where appropriate; Deepgram for non-English or language-detect agents; word-level user text chunks.
- Sonic 3 / PVC: Sonic 3 PVC updates; Hindi Sonic 3 normalizer revert; LoRA data processing and expand option for dataset file count; PVC errors to webhook.
- Manifone new voice; African French and Canadian French accents; partner agents can configure TTS models.
Other
- LoRA bugfixes.
August 2025
API
- Production-facing agent WebSocket; cancel endpoint for ending live calls.
- Improved error handling and public error codes; cache invalidation by voice ID.
Playground
- Telephony: stop billing for customer-managed numbers; Cartesia vs Twilio param separation.
- Outbound number management columns.
Line / Agents
- Deepgram Nova VAD (utterance_end_ms configurable via vad_stop_secs).
Models / Voices
- New endpoint for <audio> tags; accent column on voice API; max_buffer_delay applied to continuations; eu-north-1 region.
- GET /voices tags_operator; preview_file_url; restrict deleting voices to non-public; check owner_id when listing fine tune voices; user_errors for PVCs from API.
- New Arabic accents migration.
Other
- Max rollover multiplier for credit plans.
July 2025
API
- deploy_error status fix.
Playground
- LangChain launched voice agents with Cartesia Sonic TTS.
- Billing: Stripe customer for enterprise if needed; call runtime logs in call logs side panel; Call Logs UI nits (from June work).
Line / Agents
- Partner pipeline parity with User Agent; concurrency fix (negative concurrency); agent metric LLM credit usage for evals; AgentEvaluations functionality.
- User Code Connector WS handlers fix; agent end turn handling; summarization system prompt; user_prompt in API; transcript removed from agent metric result; deadlock fix in WS timeout.
Other
- Flushing and concurrency fixes.
June 2025
API
- UserCodeAgent deployment URL; cancel endpoint for force-ending live calls via API; Agent EoUD metric; cartesia agent speed-up; user prompt stored separately in agent metrics; agent_evaluations table; async flush for aggregator; User Code Connector WS and last bot turn handling; deployment URL delay on pickup.
- Concurrency and WS timeout fixes; improved goroutine handling; agent workers /chats timeout increase.
Playground
- Call Logs page for agents with data table and side panel; Agents demo with Twilio web dialer, visualizer, and like/dislike feedback; deployment detail page and list; Twilio number provisioning (Parts 1 & 2); GitConnector redeploy on commit; deployment logs; zip upload for deployment; feature flag by organization; agents gated behind feature flag; Deepgram as default STT for agents; orgs v2 (frontend and backend); 20K credits for organizations; enterprise free trial days and email invoice options.
- Credit usage: separate TTS & STT concurrency panels; STT and Infill charts; voice page copyable fields; call runtime logs in call logs panel.
Models / Voices
- STT: Whisper large v3; serve multiple models in STT pipeline; word-level user text chunks.
- FinetunedSTTContext fixes.
May 2025
API
- Voice conversion in enterprise.
Playground
- Post–April: Following April 2025 API changes (embeddings removed; use Voice IDs and Clone Voice).
Line / Agents
- User code deployments from DB; agent_deployments table; STT cartesia-streaming and Pipecat streaming Whisper; Bedrock proxy for OpenAI-compatible; timestamp bug fixes and default to original timestamps.
- Partner /chat and /config updates; DTMF support in UserCodeConnector; endpointing architecture.
Models / Voices
- STT: Batch engine utilization; Pipecat streaming Whisper.
- Deepgram STT client url/base_url fix.
Other
- Voice clone uploads fix.
April 2025
Breaking
- sonic-2-2025-04-16: Starting with sonic-2-2025-04-16, we’re removing support for: Embeddings; stability cloning mode; Experimental controls for speed and emotion. The similarity cloning mode is dramatically better. To control speed and emotion today, use Instant Voice Cloning (e.g. FFMPEG, Voice Changer, or instant clones from sonic-2-2025-03-07 embeddings). Users who need embeddings or experimental controls can use API version 2024-11-13 with model sonic-2-2025-03-07 (both still available). See Older models.
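The fallback described above (API version 2024-11-13 with model sonic-2-2025-03-07 for callers that still need embeddings or experimental controls) could be sketched roughly as follows. The version string, model names, and the Cartesia-Version header name come from this changelog; the function names and the shape of the returned dictionaries are hypothetical, and no request is actually sent.

```python
from typing import Optional

LEGACY_API_VERSION = "2024-11-13"     # from the changelog
LEGACY_MODEL = "sonic-2-2025-03-07"   # last snapshot with embeddings / experimental controls
CURRENT_MODEL = "sonic-2-2025-04-16"  # snapshot that drops them


def select_model(needs_embeddings: bool, needs_experimental_controls: bool) -> dict:
    """Pick a model and, when legacy features are needed, a pinned API version."""
    if needs_embeddings or needs_experimental_controls:
        return {"model": LEGACY_MODEL, "api_version": LEGACY_API_VERSION}
    # No pin: the caller keeps whatever API version it already uses.
    return {"model": CURRENT_MODEL, "api_version": None}


def build_headers(api_version: Optional[str]) -> dict:
    """Build only the version header; authentication is out of scope for this sketch."""
    headers = {}
    if api_version is not None:
        headers["Cartesia-Version"] = api_version
    return headers
```

Keeping the legacy pair in one place makes it easier to remove once instant clones replace the remaining embedding-based voices.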
API
- listVoices by ID for single voice; warm-monkey PVC; access tokens (JWT); Cartesia-Version 2024-11-13; phoneme/original timestamps language check; TTS History source; LoRA from fine-tune checkpoints; context expiry replaced by input stream delay.
- sonic-2 and sonic-2-2025-04-16 ignore experimental controls on TTS generations; voice cloning supports only similarity clones.
- Removed embeddings from all endpoints; voices may only be specified by Voice ID; /tts cannot be called with voice embeddings.
- Deprecated /voices/create and /voices/mix.
March 2025
Breaking
- sonic-2-2025-03-07 is the last Sonic 2 snapshot supporting voice embeddings and experimental controls. Use with API version 2024-11-13 for legacy behavior.
- sonic-preview → JollyTotem, RoseLion deprecated; sonic-2 alias to jolly-totem for speaker switching. See Older models.
API
- Cartesia-Version updated to 2024-11-13; model latency via header on bytes endpoint; new Sonic PVC model warm-monkey; listVoices by ID (single voice); access tokens (JWT signing, validation); API-level check for languages supporting phoneme and original timestamps.
- Organizations and billing; free credits 10k → 20k; overages product; subscription cache invalidation webhook; TTS History source column (api, playground, narrations); LoRA voices from base VoiceVariation and checkpoint for fine tunes.
Playground
- sonic-2 and sonic-turbo aliases launched; Sonic 2 / Sonic Turbo messaging (Turbo = 40ms latency).
- cartesia.ai/sonic and playground updates.
Line / Agents
- Agent ID in websocket URL; telephony info on partner calls; Pipecat version upgrade; partner demo tool calls; warm-monkey PVC model; prespeak and function call flow updates.
- Twilio voice routes support agent IDs; Keypad DTMF on agent; half-duplex STT and LLM context; original timestamps support in API.
Other
- sonic-pvc alias and DryVoice as sonic-pvc model. Python SDK announced.
February 2025
API
- listVoices by ID; localize endpoint voice name fix; 400s for bad body params; text forcing max transcript length; OpenAI-compatible STT server; agent with local STT; voice tags; on-device transcripts in evals; jolly-totem as default sonic-preview.
- S2S and Agents foundational blocks.
Playground
- Instant cloning enabled for free users; voice tags; localize refactored to use conditioning; listVoices can query by ID for single voice; Sarah (Similarity) and Southern Woman migrations; on-device transcripts.
- Narrations settings (JSONB).
Line / Agents
- Agent with local STT; foundational S2S + Agents blocks; design and pipeline work.
Models / Voices
- STT: cartesia-streaming and Pipecat streaming Whisper; on-device transcripts.
January 2025
API
- sonic-lite added to API; EU deployment for prod API; save option for TTS bytes handler; CORS header for Cartesia-File-ID; Stripe credits default to char_limit in checkout; Redis cache for overage settings; polar-mountain and VC in EU; ListFiles paginator fix.
- Eval break/spell tags and replacement/normalization mode.
Models / Voices
- sonic-preview routed to MisunderstoodFrog; polar-mountain added and staged; visionary-yogurt timestamp requests for any language.
- jolly-totem as default sonic-preview.