Using the API

Generate your first words and learn API conventions

Quickstart

To get started with the API, the first thing you’ll need is an API key. You can get one by logging into the playground and heading to “Console” at play.cartesia.ai/console.

To generate your first words, run this command in your terminal, replacing YOUR_API_KEY:

$curl -N -X POST "https://api.cartesia.ai/tts/bytes" \
> -H "Cartesia-Version: 2024-06-10" \
> -H "X-API-Key: YOUR_API_KEY" \
> -H "Content-Type: application/json" \
> -d '{"transcript": "Welcome to Cartesia Sonic!", "model_id": "sonic-english", "voice": {"mode":"id", "id": "a0e99841-438c-4a64-b679-ae501e7d6091"}, "output_format":{"container":"raw", "encoding":"pcm_f32le", "sample_rate":44100}}' \
> | ffmpeg -f f32le -i pipe: sonic.wav

You may need to install FFMPEG.

You can play the resulting sonic.wav file with afplay sonic.wav (on macOS) or ffplay sonic.wav (on any system with FFMPEG installed).

This command calls the Text-to-Speech endpoint which runs the text-to-speech generation and transmits the output in raw bytes.

The voice used in this example can be found on the playground.

API Conventions

All endpoints use HTTPS. HTTP is not supported. API keys that call the API over HTTP may be subject to automatic rotation.

All API requests use the following base URL: https://api.cartesia.ai. (For WebSockets the corresponding protocol is wss://.)

Always send a Cartesia-Version header

Each request you send our API should have a Cartesia-Version header containing the date (YYYY-MM-DD) when you tested your integration. For WebSockets, you can alternately use the ?cartesia_version query parameter, which will take precedence.

This will help us provide you with timely deprecation notices and enable us to provide automatic backwards compatibility where possible.

For a given Cartesia-Version, we will preserve existing input and output fields, but we may make non-breaking changes, such as:

  1. Add optional request fields.
  2. Add additional response fields.
  3. Change conditions for specific error types
  4. Add variants to enum-like output values.

Our versioning scheme is inspired by the Anthropic API.

Use API keys to authenticate

Authentication is handled using API keys. You can create a new API key from play.cartesia.ai/console.

  • For HTTP requests, authentication is handled by adding the field X-API-Key: <your_api_key> into the HTTP headers.
  • For WebSocket connections, authentication is handled by passing in the field ?api_key=<your_api_key> when creating the WebSocket connection.

Check response codes

Our API uses standard HTTP response codes; refer to httpstatuses.io.

Pass data according to the method

All GET requests use query parameters to pass data. All POST requests use a JSON body or multipart/form-data.