MCP - Cartesia Docs

Use cartesia-mcp in Cursor, Claude Code, or other MCP clients to list voices, run TTS and STT, manage pronunciation dictionaries, download cloud-stored generations, and more — no custom scripts.

Setup

Pick your client, install, and sign in when prompted. A Cartesia MCP API key is created for your organization if one does not exist yet.

Cursor

Click Install in Cursor, then sign in to the Playground when your browser opens.

Claude Code

Add the server from your terminal:

claude mcp add --transport http --scope user cartesia-mcp https://mcp.cartesia.ai/mcp

Then run /mcp in Claude Code, select cartesia-mcp, and sign in when prompted.

Try it

Ask your agent things like:

List all available Cartesia voices
Convert text to audio with a chosen voice (speed, volume, emotion) — returns file_id and usually a shareable download_url
Transcribe an audio file to text
Create a pronunciation dictionary and use it in TTS
Check credit usage for your account
Localize an existing voice into another language
Change an audio file to use a different voice

Tools

Tool	Description
`text_to_speech`	Convert text to audio; default `save=true` returns `file_id` and usually a 24-hour `download_url` (plus local `file_path` when running MCP locally)
`speech_to_text`	Transcribe an audio file; default `mode=batch` (batch STT), or `mode=stream` (STT WebSocket)
`list_voices`	List available voices (filter by language, search, gender, etc.)
`get_voice`	Fetch metadata for a voice by ID
`clone_voice`	Clone a voice from an audio sample
`update_voice`	Update a cloned voice’s name or description
`delete_voice`	Delete a cloned voice
`voice_change`	Re-render audio with a different voice
`localize_voice`	Adapt a voice to another language or dialect
`list_pronunciation_dicts`	List pronunciation dictionaries
`create_pronunciation_dict`	Create a pronunciation dictionary
`get_pronunciation_dict`	Get a pronunciation dictionary by ID
`update_pronunciation_dict`	Update a pronunciation dictionary
`delete_pronunciation_dict`	Delete a pronunciation dictionary
`download_file`	Fetch a cloud file by `file_id` (from `text_to_speech` or playground history); returns a fresh `download_url` and a local copy
`get_credit_usage`	Credit usage over time (org admins only)

See the cartesia-mcp source for parameters and return types.
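Agents normally issue these calls for you, but it can help to see what one looks like on the wire. The sketch below builds a JSON-RPC 2.0 `tools/call` request, the message shape the MCP specification uses for tool invocation. The tool names and the `mode` value come from the table above; the helper `build_tool_call` and the `file_path` argument name are hypothetical, so check the cartesia-mcp source for the real parameter names.

```python
import json

# Tool names copied from the table above.
KNOWN_TOOLS = {
    "text_to_speech", "speech_to_text", "list_voices", "get_voice",
    "clone_voice", "update_voice", "delete_voice", "voice_change",
    "localize_voice", "list_pronunciation_dicts", "create_pronunciation_dict",
    "get_pronunciation_dict", "update_pronunciation_dict",
    "delete_pronunciation_dict", "download_file", "get_credit_usage",
}


def build_tool_call(name, arguments, request_id=1):
    """Return a JSON-RPC 2.0 `tools/call` request as a dict."""
    if name not in KNOWN_TOOLS:
        raise ValueError(f"unknown cartesia-mcp tool: {name}")
    return {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": name, "arguments": arguments},
    }


# `mode` is documented above; `file_path` is a hypothetical argument name.
request = build_tool_call(
    "speech_to_text",
    {"file_path": "/home/me/audio/meeting.wav", "mode": "batch"},
)
print(json.dumps(request, indent=2))
```

Rejecting unknown tool names up front gives a clearer error than a failed round trip to the server.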

Advanced configuration

Hosted server

Default endpoint: https://mcp.cartesia.ai/mcp. The install flows above use this automatically. You can also connect from API Keys in the Playground. Org admins can opt in to admin tools when approving the connection.
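For MCP clients without a one-click install, the endpoint can usually be registered by hand. The sketch below prints a JSON entry in the `mcpServers` layout that Cursor and several other clients read from their MCP config file. That layout and the helper name `remote_server_entry` are assumptions, not something this page specifies, so check your client's documentation for the exact file location and keys.

```python
import json

# Endpoint taken from this page; the config layout below is an assumption.
CARTESIA_MCP_URL = "https://mcp.cartesia.ai/mcp"


def remote_server_entry(name, url):
    """Build an `mcpServers` entry pointing a client at a remote MCP server."""
    return {"mcpServers": {name: {"url": url}}}


config = remote_server_entry("cartesia-mcp", CARTESIA_MCP_URL)
print(json.dumps(config, indent=2))
```

Sign-in still happens in the browser the first time the client connects, so no API key appears in this entry.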

Local development

To run your own checkout with stdio and env vars (OUTPUT_DIRECTORY, CARTESIA_ADMIN_API_KEY, etc.), see the cartesia-mcp README.

Local audio files

Tools like speech_to_text and voice_change need paths to existing audio files on disk. Pass the full path to each file when prompting your agent.

For speech_to_text, use the default batch mode for typical file transcription (mp3, flac, wav, and other common containers). Use mode="stream" for mono PCM WAV (for example TTS output) or raw PCM with encoding and sample_rate.
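Before prompting, it can be useful to confirm that a path is absolute and to check whether a file meets the mono PCM WAV requirement for mode="stream". The sketch below does both with the Python standard library; the helper names `resolve_audio_path` and `is_mono_pcm_wav` are illustrative and not part of cartesia-mcp.

```python
import wave
from pathlib import Path


def resolve_audio_path(path):
    """Expand and absolutize a path, failing early if the file is missing."""
    full_path = Path(path).expanduser().resolve()
    if not full_path.is_file():
        raise FileNotFoundError(f"no audio file at {full_path}")
    return full_path


def is_mono_pcm_wav(path):
    """Return True if the file opens as a WAV with exactly one channel.

    The standard-library wave module reads uncompressed PCM WAV and raises
    wave.Error for most other inputs, so failure to open means "not eligible".
    """
    try:
        with wave.open(str(path), "rb") as wav:
            return wav.getnchannels() == 1
    except (wave.Error, EOFError):
        return False
```

A file that fails this check (stereo WAV, mp3, flac, and so on) is still a fine candidate for the default batch mode.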

TTS downloads

text_to_speech persists audio to your org’s cloud storage by default (save=true) and returns:

file_id — pass to download_file to mint a fresh link or download again
download_url — when link creation succeeds, a time-limited URL (24 hours) you can open in a browser or hand to a hosted agent (Claude, ChatGPT, etc.). If omitted, call download_file with the file_id.
file_path — copy on the machine running MCP. With local uvx, use this path for speech_to_text on the same machine. On hosted MCP (mcp.cartesia.ai), paths are on the server — use download_url instead.

Set save=false for a local-only generation with no cloud fields.
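The rules above amount to a small decision procedure, sketched below. It assumes the tool result has already been parsed into a dict keyed by the field names listed above; the function `pick_audio_source` and its `hosted` flag are hypothetical names used only for illustration.

```python
def pick_audio_source(result, hosted):
    """Return (kind, value) describing how to fetch the generated audio.

    result: dict that may contain "file_path", "download_url", "file_id".
    hosted: True when talking to mcp.cartesia.ai, where file_path refers
            to the server's disk and is not usable by the caller.
    """
    # A local MCP process wrote the file on this machine: use it directly.
    if not hosted and result.get("file_path"):
        return ("file_path", result["file_path"])
    # Otherwise prefer the time-limited link when one was created.
    if result.get("download_url"):
        return ("download_url", result["download_url"])
    # No link: the caller should invoke download_file with this file_id.
    if result.get("file_id"):
        return ("download_file", result["file_id"])
    raise ValueError("result has no usable file_path, download_url, or file_id")
```

With save=false on a local server only the first branch can match, since the result carries no cloud fields.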

API version

Cartesia MCP uses Cartesia-Version: 2026-03-01.

cartesia-mcp

The official Cartesia MCP Server