Prerequisites
- A Cartesia account.
- An API key.
- FFmpeg installed (optional but recommended).
Generate your first words
- cURL
- Python
- JavaScript/TypeScript
To generate your first words, run the command in your terminal, replacing YOUR_API_KEY with your real API key. If you leave the placeholder in, the command won’t output anything!
You can play the resulting sonic-2.wav file with afplay sonic-2.wav (on macOS) or ffplay sonic-2.wav (on any system with FFmpeg installed). You can also just double-click it in your file explorer.
This command calls the Text to Speech (Bytes) endpoint, which runs the text-to-speech generation and transmits the output in raw bytes.
The bytes endpoint supports a variety of output formats, which makes it a good fit for batch use cases where you want to save the audio in advance.
In comparison, Cartesia’s WebSocket and Server-Sent Events endpoints stream out raw PCM audio to avoid latency overhead from transcoding the audio.
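As a minimal sketch of what a call to the bytes endpoint can look like from Python, the snippet below builds the request and saves the returned audio using only the standard library. The endpoint URL, header names, version string, and request fields are assumptions based on Cartesia's public API and should be checked against the current API reference. YOUR_VOICE_ID is a placeholder, and the helper names build_tts_request and save_speech are hypothetical.

```python
import json
import urllib.request

API_URL = "https://api.cartesia.ai/tts/bytes"  # assumed endpoint URL
API_VERSION = "2024-11-13"  # assumed value for the Cartesia-Version header


def build_tts_request(api_key, transcript, voice_id, model_id="sonic-2"):
    """Build (but do not send) a POST request for the bytes endpoint."""
    payload = {
        "model_id": model_id,
        "transcript": transcript,
        "voice": {"mode": "id", "id": voice_id},
        # Ask for a WAV container so the saved file is directly playable.
        "output_format": {
            "container": "wav",
            "encoding": "pcm_f32le",
            "sample_rate": 44100,
        },
    }
    return urllib.request.Request(
        API_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "X-API-Key": api_key,
            "Cartesia-Version": API_VERSION,
            "Content-Type": "application/json",
        },
        method="POST",
    )


def save_speech(request, path="sonic-2.wav"):
    """Send the request and write the returned audio bytes to `path`."""
    with urllib.request.urlopen(request) as response, open(path, "wb") as f:
        f.write(response.read())
    return path


# Example usage (makes a real network call, so it is left commented out):
# req = build_tts_request("YOUR_API_KEY", "Welcome to Cartesia!", "YOUR_VOICE_ID")
# save_speech(req)
```

Because the whole response body is read before anything is written, this pattern suits the batch use case described above rather than low-latency playback. Once the file is saved, it can be played with afplay or ffplay as described earlier.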