The cartesia-mcp package exposes Cartesia through the Model Context Protocol (MCP) so MCP-capable clients—Cursor, Claude Code, Codex, and similar—can list voices, run TTS and STT, manage pronunciation dictionaries, clone voices, and more without custom scripts.

Requirements

  • uv — runs the server via uvx with no global install
  • Python 3.13+ (installed automatically by uvx)
  • A Cartesia API key (format sk_car_…)

Setup

Get an API key, then connect cartesia-mcp to your agent.
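
Connecting usually means registering a server entry in your client's MCP configuration. The sketch below prints one such entry. It assumes the `mcpServers` JSON layout that several MCP clients use and that the server starts with `uvx cartesia-mcp`; neither detail is confirmed on this page, and the `cartesia` label and the `build_server_entry` helper are illustrative names only. Check your client's documentation for the actual file location and schema.

```python
import json


def build_server_entry(api_key: str) -> dict:
    """Return a hypothetical MCP client config entry for cartesia-mcp."""
    return {
        "mcpServers": {  # assumed top-level key; varies by client
            "cartesia": {  # arbitrary label for this server (assumption)
                "command": "uvx",  # uv's tool runner, listed under Requirements
                "args": ["cartesia-mcp"],  # assumed invocation of the package
                "env": {"CARTESIA_API_KEY": api_key},
            }
        }
    }


if __name__ == "__main__":
    # Replace the placeholder with your real key before pasting the output.
    print(json.dumps(build_server_entry("YOUR_API_KEY"), indent=2))
```

The `env` object in the output is the same one that the Output directory and Admin API key sections extend.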

Try it

Ask your agent things like:
  • List all available Cartesia voices
  • Convert text to audio with a chosen voice (speed, volume, emotion)
  • Transcribe an audio file to text
  • Create a pronunciation dictionary and use it in TTS
  • Check credit usage for your account
  • Localize an existing voice into another language
  • Change an audio file to use a different voice

Tools

Tool                       Description
text_to_speech             Convert text to audio; optional speed, volume, emotion, and pronunciation dict
speech_to_text             Stream-transcribe an audio file via STT WebSocket (ink-2)
list_voices                List available voices (filter by language, search, gender, etc.)
get_voice                  Fetch metadata for a voice by ID
clone_voice                Clone a voice from an audio sample
update_voice               Update a cloned voice’s name or description
delete_voice               Delete a cloned voice
voice_change               Re-render audio with a different voice
localize_voice             Adapt a voice to another language or dialect
list_pronunciation_dicts   List pronunciation dictionaries
create_pronunciation_dict  Create a pronunciation dictionary
get_pronunciation_dict     Get a pronunciation dictionary by ID
update_pronunciation_dict  Update a pronunciation dictionary
delete_pronunciation_dict  Delete a pronunciation dictionary
get_credit_usage           Credit usage over time (admin API key)

See the cartesia-mcp source for parameters and return types.

Output directory

By default, generated audio is written to the server’s working directory. To choose a fixed folder, add OUTPUT_DIRECTORY to env:
"env": {
  "CARTESIA_API_KEY": "YOUR_API_KEY",
  "OUTPUT_DIRECTORY": "~/cartesia-output"
}
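
This page does not say whether the server expands `~`, and an MCP client may pass env values to the process as-is, without shell expansion. If the folder does not end up where you expect, an absolute path sidesteps the question. The sketch below computes one; `absolute_output_dir` is a hypothetical helper, not part of cartesia-mcp.

```python
import os


def absolute_output_dir(raw: str, create: bool = False) -> str:
    """Expand a leading ~ and return an absolute path.

    When create is True, the folder (and any parents) is created if missing.
    """
    path = os.path.abspath(os.path.expanduser(raw))
    if create:
        os.makedirs(path, exist_ok=True)
    return path


if __name__ == "__main__":
    # Paste the printed value into OUTPUT_DIRECTORY in your client config.
    print(absolute_output_dir("~/cartesia-output"))
```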

Local audio files

Tools like speech_to_text and voice_change need paths to existing audio files on disk. Pass the full path to each file when prompting your agent. For speech_to_text, use a mono PCM WAV file (or raw PCM with encoding and sample_rate).
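
Before handing a file to speech_to_text, it can help to confirm that it really is a mono WAV. The sketch below uses Python's standard `wave` module, which reads mainly uncompressed PCM WAV files. The helper names are hypothetical, and the check reflects only the requirement stated above, not any validation the server itself performs.

```python
import wave


def describe_wav(path: str) -> dict:
    """Read basic format info from a WAV file using the standard library."""
    # wave.open raises wave.Error if the file is not a WAV it can parse.
    with wave.open(path, "rb") as wav:
        return {
            "channels": wav.getnchannels(),
            "sample_rate": wav.getframerate(),
            "sample_width_bytes": wav.getsampwidth(),
        }


def is_mono_pcm_wav(path: str) -> bool:
    """True if the file opens as a readable WAV with exactly one channel."""
    try:
        return describe_wav(path)["channels"] == 1
    except (wave.Error, EOFError):
        # Not a WAV the module can read, or the file is truncated.
        return False
```

If the check fails, convert the file to mono PCM WAV first, or pass raw PCM with explicit encoding and sample_rate as described above.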

Admin API key

Some tools call management endpoints that accept admin API keys only (sk_car_admin_...). To use get_credit_usage, set CARTESIA_ADMIN_API_KEY in env in addition to CARTESIA_API_KEY. Admin keys work only on management routes; API keys from play.cartesia.ai/keys do not work on those routes, and admin keys do not work on generation routes. Mint admin keys in the Playground under Keys → Admin (organization admins only).
"env": {
  "CARTESIA_API_KEY": "YOUR_API_KEY",
  "CARTESIA_ADMIN_API_KEY": "YOUR_ADMIN_API_KEY"
}
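
Because the two key types are not interchangeable, swapping them is an easy mistake to make. The sketch below flags that case using only the prefixes shown on this page (sk_car_ and sk_car_admin_). It is a prefix check under that assumption, not real validation, and both function names are hypothetical.

```python
def classify_key(key: str) -> str:
    """Classify a key by prefix: 'admin', 'api', or 'unknown'."""
    # Test the admin prefix first, since admin keys also start with sk_car_.
    if key.startswith("sk_car_admin_"):
        return "admin"
    if key.startswith("sk_car_"):
        return "api"
    return "unknown"


def check_env(env: dict) -> list:
    """Return human-readable problems found in an MCP env mapping."""
    problems = []
    if classify_key(env.get("CARTESIA_API_KEY", "")) != "api":
        problems.append("CARTESIA_API_KEY does not look like a regular API key")
    admin = env.get("CARTESIA_ADMIN_API_KEY")
    # The admin key is optional; only check it when it is set.
    if admin is not None and classify_key(admin) != "admin":
        problems.append("CARTESIA_ADMIN_API_KEY does not look like an admin key")
    return problems
```

An empty list means the prefixes are consistent with the roles; it says nothing about whether the keys are active.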

API version

Cartesia MCP is built using Cartesia-Version: 2026-03-01.

cartesia-mcp

The official Cartesia MCP Server