Make an API request
Generate your first words and learn API conventions
Prerequisites
- A Cartesia account.
- An API key.
- FFmpeg installed (optional but recommended).
FFmpeg isn’t required to use the Cartesia API, but it’s useful for saving, playing, and converting audio files, so we will use it in the examples below. You can install it using your platform’s package manager.
Generate your first words
To generate your first words, run this command in your terminal, replacing YOUR_API_KEY:
Make sure to replace YOUR_API_KEY with your real API key, or the command won’t output anything!
You can play the resulting sonic.wav file with afplay sonic.wav (on macOS) or ffplay sonic.wav (on any system with FFmpeg installed). You can also just double-click it in your file explorer.
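If playback fails, it can help to check whether the file contains WAV data at all, since (per the note above about the API key) the command may have written nothing. The following is a minimal sketch using only the Python standard library. The helper name looks_like_wav is hypothetical, and the check assumes the output is a WAV container, as the sonic.wav filename suggests.

```python
from pathlib import Path


def looks_like_wav(path: str) -> bool:
    """Return True if the file at `path` starts with a RIFF/WAVE header.

    Hypothetical helper for illustration; it inspects only the first
    12 bytes and does not validate the rest of the file.
    """
    file = Path(path)
    if not file.is_file():
        return False
    with file.open("rb") as f:
        header = f.read(12)
    # A WAV file begins with b"RIFF", a 4-byte size field, then b"WAVE".
    return len(header) == 12 and header[:4] == b"RIFF" and header[8:12] == b"WAVE"


# Example usage:
# print(looks_like_wav("sonic.wav"))
```

An empty or missing file returns False, which usually points back to the API key or the request itself and not to the audio player.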
This command calls the Text to Speech (Bytes) endpoint, which runs the text-to-speech generation and returns the audio as raw bytes.
The bytes endpoint supports a variety of output formats, making it perfect for batch use cases where you want to save the audio in advance.
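For a batch script, the same kind of call can be made from Python and the response body written straight to disk. The sketch below uses only the standard library and is not taken from this page: the endpoint URL, the header names (X-API-Key, Cartesia-Version), and the JSON field names are assumptions to verify against the API reference, and every value in the usage example is a placeholder.

```python
import json
import urllib.request

# ASSUMPTION: endpoint URL for the Text to Speech (Bytes) endpoint.
TTS_BYTES_URL = "https://api.cartesia.ai/tts/bytes"


def build_tts_request(api_key, api_version, model_id, voice_id, transcript):
    """Build (but do not send) a POST request for the bytes endpoint.

    ASSUMPTION: the header names and payload fields below are illustrative
    and should be checked against the official API reference.
    """
    payload = {
        "model_id": model_id,
        "transcript": transcript,
        "voice": {"mode": "id", "id": voice_id},
        "output_format": {
            "container": "wav",
            "encoding": "pcm_f32le",
            "sample_rate": 44100,
        },
    }
    return urllib.request.Request(
        TTS_BYTES_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "X-API-Key": api_key,
            "Cartesia-Version": api_version,
            "Content-Type": "application/json",
        },
        method="POST",
    )


def save_speech(request, path="sonic.wav"):
    """Send the request and write the raw response bytes to `path`."""
    with urllib.request.urlopen(request) as response, open(path, "wb") as out:
        out.write(response.read())


# Example usage (placeholders, not real values):
# req = build_tts_request("YOUR_API_KEY", "YOUR_API_VERSION",
#                         "YOUR_MODEL_ID", "YOUR_VOICE_ID", "Hello, world!")
# save_speech(req)
```

Building the request separately from sending it keeps the payload easy to inspect before any network call is made.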
In comparison, Cartesia’s WebSocket and Server-Sent Events endpoints stream raw PCM audio, which avoids the latency overhead of transcoding.
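Raw PCM carries no header, so whoever receives it must already know the sample rate, channel count, and sample encoding. To save streamed PCM as a playable file, it can be wrapped in a WAV container afterwards. The sketch below uses Python's standard wave module and assumes 16-bit signed little-endian mono samples at 44100 Hz. These values are illustrative assumptions, not what the streaming endpoints necessarily emit, and the wave module writes integer PCM only, so float encodings would need a different approach.

```python
import wave


def pcm_to_wav(pcm: bytes, path: str, sample_rate: int = 44100,
               channels: int = 1, sample_width: int = 2) -> None:
    """Write headerless integer PCM samples into a WAV container.

    ASSUMPTION: the defaults (44100 Hz, mono, 16-bit) are placeholders;
    they must match the format that was actually requested.
    """
    with wave.open(path, "wb") as wav:
        wav.setnchannels(channels)
        wav.setsampwidth(sample_width)  # bytes per sample; 2 means 16-bit
        wav.setframerate(sample_rate)
        wav.writeframes(pcm)


# Example usage: 0.5 seconds of 16-bit mono silence at 44100 Hz.
# pcm_to_wav(b"\x00\x00" * 22050, "silence.wav")
```

If the parameters do not match the real stream, the file will still be written but will play back at the wrong speed or as noise.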
The voice used above can be found on the playground.