December 2025

API

  • sonic-3-latest (preview) and dated sonic-3-YYYY-MM-DD snapshots.
  • sonic-3-latest added to Playground TTS with banner when selected. See Changelog 2026.
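
The difference between the moving sonic-3-latest alias and a dated snapshot can be checked client-side. The sketch below is illustrative only: the helper names snapshot_date and is_pinned are hypothetical, and the only thing taken from the changelog is the sonic-3-YYYY-MM-DD naming pattern.

```python
import re
from datetime import date
from typing import Optional

# Pattern for dated snapshot IDs as named in the changelog: sonic-3-YYYY-MM-DD.
_SNAPSHOT_RE = re.compile(r"^sonic-3-(\d{4})-(\d{2})-(\d{2})$")


def snapshot_date(model_id: str) -> Optional[date]:
    """Return the date of a dated sonic-3 snapshot ID, or None for anything else
    (for example the sonic-3-latest alias)."""
    match = _SNAPSHOT_RE.match(model_id)
    if match is None:
        return None
    year, month, day = (int(part) for part in match.groups())
    return date(year, month, day)  # raises ValueError for impossible dates


def is_pinned(model_id: str) -> bool:
    """True when the ID names a fixed dated snapshot rather than a moving alias."""
    return snapshot_date(model_id) is not None
```

A caller that wants reproducible output could refuse to ship a configuration for which is_pinned returns False.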

Voice changes

  • Voice Library — December: 25 new voices across 6 languages (12 English, 6 Hindi, 4 Arabic, 1 Spanish, 1 Japanese); 14 featured.
  • Voice library changes; featured voice badge on voice page; /voices/recent endpoint.

Playground

  • Report generation (report button, alert when user reports).
  • Voice move; archive and publish voices.
  • PVC: custom PVC voices UI, multiple user errors surfaced to UI, feature flag for custom model during creation.
  • Pronunciation dicts: new backend APIs, generator on create/edit, case sensitivity badge.
  • Agents: new text-to-agent UI, create agent from GitHub repo tarball, system prompt generator for UI agent.
  • Narrations sunset notice; TTS History pagination; auth strategy for access-tokens.
  • sonic-3-latest banner and naming.

Other

  • PVC, STT, and agent improvements.
  • Error handling and error codes.

November 2025

API

  • Improved error handling and public error responses; cache invalidation by voice ID.
  • IPVC train API (remove markAsReady); dataset files overfetch fix; default voice logic fix.

Playground

  • Pronunciation dicts migrate to new backend APIs; persist visual theme to DB; PVC pipeline error and recommendations.
  • Call logs conversation view default; TTS textarea height fix; Sonic-3 model for partners shown.
  • Billing overage “blood bar” and alert fixes; PVC gate for Startup plan.
  • Pronunciation dict generator on create/edit; API version in dialog; featured voice toggle; narrations model selection.

Line / Agents

  • No user audio warning (250ms); Pipecat DeepgramNovaVADFilter.
  • Call recording and artifact storage fixes.
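
A "no user audio" warning of this kind can be pictured as a small watchdog over audio timestamps. This is a minimal sketch and not the Line implementation: the class name, its methods, and the choice to stay silent before any audio has arrived are all assumptions; only the 250 ms threshold comes from the entry above.

```python
from typing import Optional

NO_AUDIO_THRESHOLD_S = 0.250  # 250 ms, the threshold named in the changelog


class NoAudioWatchdog:
    """Hypothetical helper: remembers when user audio was last seen and
    reports gaps at or above the threshold."""

    def __init__(self, threshold_s: float = NO_AUDIO_THRESHOLD_S) -> None:
        self.threshold_s = threshold_s
        self._last_audio_at: Optional[float] = None

    def on_audio(self, now: float) -> None:
        """Record that a user audio frame arrived at time `now` (seconds)."""
        self._last_audio_at = now

    def should_warn(self, now: float) -> bool:
        """True when audio has been seen before and the gap since the last
        frame is at least the threshold. Assumption: no warning before the
        first frame arrives."""
        if self._last_audio_at is None:
            return False
        return (now - self._last_audio_at) >= self.threshold_s
```

In a real pipeline, `now` would typically come from a monotonic clock such as time.monotonic().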

Models / Voices

  • Sonic 3 PVC and normalizer updates; LoRA and PVC error handling; expand option for dataset file count.
  • preview_file_url; tags_operator on GET /voices; restrict delete to non-public voices; owner_id check for fine tune voices; user_errors for PVCs.
  • New Arabic accents; African French and Canadian French.

October 2025

Model changes

  • Sonic 3 launch (Oct 27): sonic-3-2025-10-27 stable snapshot released; 42 languages; volume, speed, and emotion controls.
  • Real-time conversation with emotion and laughter; ~190ms median latency. See Sonic 3 and Volume, Speed, and Emotion.

Other

  • Continued PVC, STT, and agent improvements; error handling and public errors; Manifone voices; Sonic 3 PVC and normalizer updates.
  • Transcript buffer multilingual and Thai pronunciation dictionary fix; TTFA buffering and reporting; Voice Conversion operator reload; audio norm operator.

September 2025

API

  • user_id to owner_id in API (model aliasing / ownership).
  • Improved error handling and version/limit checks.

Line / Agents

  • Warning if no user audio for 250ms+; Pipecat DeepgramNovaVADFilter for spurious on_speech_started.
  • Call recording and artifact storage fixes.

Models / Voices

  • STT: Migrate STT providers to Deepgram where appropriate; Deepgram for non-English or language-detect agents; word-level user text chunks.
  • Sonic 3 / PVC: Sonic 3 PVC updates; Hindi Sonic 3 normalizer revert; LoRA data processing and expand option for dataset file count; PVC errors to webhook.
  • Manifone new voice; African French and Canadian French accents; partner agents can configure TTS models.

Other

  • LoRA bugfixes.

August 2025

API

  • Production-facing agent WebSocket; cancel endpoint for ending live calls.
  • Improved error handling and public error codes; cache invalidation by voice ID.

Playground

  • Telephony: stop billing for customer-managed numbers; Cartesia vs Twilio param separation.
  • Outbound number management columns.

Line / Agents

  • Deepgram Nova VAD (utterance_end_ms configurable via vad_stop_secs).
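
The entry above ties a millisecond setting (utterance_end_ms) to a seconds setting (vad_stop_secs). Assuming the relationship is a plain unit conversion, which the changelog does not spell out, it could be sketched as follows. The function name and the positivity check are illustrative choices, not part of any library.

```python
def utterance_end_ms_from(vad_stop_secs: float) -> int:
    """Convert a stop-silence duration in seconds to whole milliseconds.

    Assumption: utterance_end_ms is simply vad_stop_secs expressed in ms.
    """
    if vad_stop_secs <= 0:
        raise ValueError("vad_stop_secs must be positive")
    return round(vad_stop_secs * 1000)
```

For example, a vad_stop_secs of 0.8 would correspond to an utterance_end_ms of 800 under this assumption.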

Models / Voices

  • New endpoint for <audio> tags; accent column on voice API; max_buffer_delay applied to continuations; eu-north-1 region.
  • GET /voices tags_operator; preview_file_url; restrict deleting voices to non-public; check owner_id when listing fine tune voices; user_errors for PVCs from API.
  • New Arabic accents migration.
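
The changelog does not document the values tags_operator accepts. One plausible reading is that it selects whether a voice must carry all requested tags or any of them. The sketch below illustrates that reading on local data: the "and" / "or" values, the function name, and the shape of the voice records are all hypothetical.

```python
from typing import Any, Dict, List, Sequence


def filter_voices(
    voices: Sequence[Dict[str, Any]],
    tags: Sequence[str],
    tags_operator: str = "or",
) -> List[Dict[str, Any]]:
    """Filter voice records by tags.

    Hypothetical semantics: "and" keeps voices that have every requested tag,
    "or" keeps voices that have at least one.
    """
    wanted = set(tags)
    if tags_operator == "and":
        return [v for v in voices if wanted <= set(v["tags"])]
    if tags_operator == "or":
        return [v for v in voices if wanted & set(v["tags"])]
    raise ValueError(f"unsupported tags_operator: {tags_operator!r}")
```

The result keeps the input order, so a caller can combine it with whatever sorting the listing already applies.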

Other

  • Max rollover multiplier for credit plans.

July 2025

API

  • deploy_error status fix.

Playground

  • LangChain launched voice agents with Cartesia Sonic TTS.
  • Billing: Stripe customer for enterprise if needed; call runtime logs in call logs side panel; Call Logs UI nits (from June work).

Line / Agents

  • Partner pipeline parity with User Agent; concurrency fix (negative concurrency); agent metric LLM credit usage for evals; AgentEvaluations functionality.
  • User Code Connector WS handlers fix; agent end turn handling; summarization system prompt; user_prompt in API; transcript removed from agent metric result; deadlock fix in WS timeout.

Other

  • Flushing and concurrency fixes.

June 2025

API

  • UserCodeAgent deployment URL; cancel endpoint for force-ending live calls via API; Agent EoUD metric; cartesia agent speed-up; user prompt stored separately in agent metrics; agent_evaluations table; async flush for aggregator; User Code Connector WS and last bot turn handling; deployment URL delay on pickup.
  • Concurrency and WS timeout fixes; improved goroutine handling; agent workers /chats timeout increase.

Playground

  • Call Logs page for agents with data table and side panel; Agents demo with Twilio web dialer, visualizer, and like/dislike feedback; deployment detail page and list; Twilio number provisioning (Parts 1 & 2); GitConnector redeploy on commit; deployment logs; zip upload for deployment; feature flag by organization; agents gated behind feature flag; Deepgram as default STT for agents; orgs v2 (frontend and backend); 20K credits for organizations; enterprise free trial days and email invoice options.
  • Credit usage: separate TTS & STT concurrency panels; STT and Infill charts; voice page copyable fields; call runtime logs in call logs panel.

Models / Voices

  • STT: Whisper large v3; serve multiple models in STT pipeline; word-level user text chunks.
  • FinetunedSTTContext fixes.

May 2025

API

  • Voice conversion in enterprise.

Line / Agents

  • User code deployments from DB; agent_deployments table; STT cartesia-streaming and Pipecat streaming Whisper; Bedrock proxy for OpenAI-compatible; timestamp bug fixes and default to original timestamps.
  • Partner /chat and /config updates; DTMF support in UserCodeConnector; endpointing architecture.

Models / Voices

  • STT: Batch engine utilization; Pipecat streaming Whisper.
  • Deepgram STT client url/base_url fix.

Other

  • Voice clone uploads fix.

April 2025

Breaking

  • sonic-2-2025-04-16: Starting with this snapshot, we’re removing support for embeddings, the stability cloning mode, and the experimental controls for speed and emotion. The similarity cloning mode is dramatically better. To control speed and emotion today, use Instant Voice Cloning (e.g. FFmpeg, Voice Changer, or instant clones from sonic-2-2025-03-07 embeddings). Users who still need embeddings or the experimental controls can use API version 2024-11-13 with model sonic-2-2025-03-07; both remain available. See Older models.
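
The legacy pairing described above (model sonic-2-2025-03-07 with API version 2024-11-13) can be kept together in client configuration so the two values are never mixed with newer ones. This is a minimal sketch: the TtsTarget type and helper functions are hypothetical, and the newer API version is left for the caller to supply because the changelog does not state it. The Cartesia-Version header name is the one the changelog itself uses.

```python
from dataclasses import dataclass
from typing import Dict


@dataclass(frozen=True)
class TtsTarget:
    """Hypothetical pairing of a model ID with the API version to send."""
    model_id: str
    api_version: str  # value for the Cartesia-Version header


# The legacy combination named in the changelog entry above.
LEGACY = TtsTarget(model_id="sonic-2-2025-03-07", api_version="2024-11-13")


def choose_target(needs_legacy_features: bool, current: TtsTarget) -> TtsTarget:
    """Pick the legacy pairing when embeddings or experimental controls are
    required; otherwise use the caller's current pairing."""
    return LEGACY if needs_legacy_features else current


def version_headers(target: TtsTarget) -> Dict[str, str]:
    """Build the version header for a request to the chosen target."""
    return {"Cartesia-Version": target.api_version}
```

Keeping the model ID and API version in one immutable value makes it harder to send the legacy model with a newer API version by accident.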

API

  • listVoices by ID for single voice; warm-monkey PVC; access tokens (JWT); Cartesia-Version 2024-11-13; phoneme/original timestamps language check; TTS History source; LoRA from fine-tune checkpoints; context expiry replaced by input stream delay.
  • sonic-2 and sonic-2-2025-04-16 ignore experimental controls on TTS generations; voice cloning supports only similarity clones.
  • Removed embeddings from all endpoints; voices may only be specified by Voice ID; /tts cannot be called with voice embeddings.
  • Deprecated /voices/create and /voices/mix.
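
Since voices may now only be specified by Voice ID, a client could reject embedding-style voice specifications before sending a request. The sketch below assumes a dictionary-shaped voice specification with "mode", "id", and "embedding" keys; that shape and the function name are illustrative assumptions, not a statement of the API schema.

```python
from typing import Any, Dict


def voice_id_from_spec(voice: Dict[str, Any]) -> str:
    """Return the voice ID from a voice specification, rejecting embeddings.

    Assumed shape: {"mode": "id", "id": "<voice id>"}. Anything carrying an
    "embedding" key or mode is treated as the removed legacy form.
    """
    if "embedding" in voice or voice.get("mode") == "embedding":
        raise ValueError("voice embeddings are no longer accepted; pass a voice ID")
    voice_id = voice.get("id")
    if not isinstance(voice_id, str) or not voice_id:
        raise ValueError("voice specification must include a non-empty 'id'")
    return voice_id
```

Failing early on the client gives a clearer error than waiting for the server to reject the request.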

March 2025

Breaking

  • sonic-2-2025-03-07 is the last Sonic 2 snapshot supporting voice embeddings and experimental controls. Use with API version 2024-11-13 for legacy behavior.
  • sonic-preview → JollyTotem, RoseLion deprecated; sonic-2 alias to jolly-totem for speaker switching. See Older models.

API

  • Cartesia-Version updated to 2024-11-13; model latency via header on bytes endpoint; new Sonic PVC model warm-monkey; listVoices by ID (single voice); access tokens (JWT signing, validation); API-level check for languages supporting phoneme and original timestamps.
  • Organizations and billing; free credits 10k → 20k; overages product; subscription cache invalidation webhook; TTS History source column (api, playground, narrations); LoRA voices from base VoiceVariation and checkpoint for fine tunes.

Playground

  • sonic-2 and sonic-turbo aliases launched; Sonic 2 / Sonic Turbo messaging (Turbo = 40ms latency).
  • cartesia.ai/sonic and playground updates.

Line / Agents

  • Agent ID in websocket URL; telephony info on partner calls; Pipecat version upgrade; partner demo tool calls; warm-monkey PVC model; prespeak and function call flow updates.
  • Twilio voice routes support agent IDs; Keypad DTMF on agent; half-duplex STT and LLM context; original timestamps support in API.

Other

  • sonic-pvc alias and DryVoice as sonic-pvc model. Python SDK announced.

February 2025

API

  • listVoices by ID; localize endpoint voice name fix; 400s for bad body params; text forcing max transcript length; OpenAI-compatible STT server; agent with local STT; voice tags; on-device transcripts in evals; jolly-totem as default sonic-preview.
  • S2S and Agents foundational blocks.

Playground

  • Instant cloning enabled for free users; voice tags; localize refactored to use conditioning; listVoices can query by ID for single voice; Sarah (Similarity) and Southern Woman migrations; on-device transcripts.
  • Narrations settings (JSONB).

Line / Agents

  • Agent with local STT; foundational S2S + Agents blocks; design and pipeline work.

Models / Voices

  • STT: cartesia-streaming and Pipecat streaming Whisper; on-device transcripts.

January 2025

API

  • sonic-lite added to API; EU deployment for prod API; save option for TTS bytes handler; CORS header for Cartesia-File-ID; Stripe credits default to char_limit in checkout; Redis cache for overage settings; polar-mountain and VC in EU; ListFiles paginator fix.
  • Eval break/spell tags and replacement/normalization mode.

Models / Voices

  • sonic-preview routed to MisunderstoodFrog; polar-mountain added and staged; visionary-yogurt timestamp requests for any language.
  • jolly-totem as default sonic-preview.