Batch Speech-to-Text: This documentation covers OpenAI SDK compatibility for Cartesia Ink’s batched transcription endpoint. For real-time transcription, use our Streaming STT endpoint.
Endpoints
- Cartesia Native: /stt - Full feature support
- OpenAI Compatible: /audio/transcriptions - Drop-in replacement for Whisper on the OpenAI SDK
Migration Guide for OpenAI SDK
Replace your OpenAI base URL with https://api.cartesia.ai to use the compatibility layer for Cartesia.
Parameter Support
Supported Parameters:
- file - The audio file to transcribe
- model - Use ink-whisper for Cartesia’s latest model
- language - Input audio language (ISO-639-1 format)
- timestamp_granularities - Include ["word"] to get word-level timestamps
Python Example
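A minimal sketch of the SDK call, assuming the official openai Python package (v1 or later) is installed. The CARTESIA_API_KEY environment variable, the audio.wav path, and the helper names build_transcription_kwargs and transcribe are illustrative assumptions, not names taken from Cartesia's documentation.

```python
import os

CARTESIA_BASE_URL = "https://api.cartesia.ai"


def build_transcription_kwargs(audio_file, language="en", word_timestamps=False):
    """Collect only the parameters listed as supported above."""
    kwargs = {
        "file": audio_file,
        "model": "ink-whisper",
        "language": language,  # ISO-639-1 code, e.g. "en" or "de"
    }
    if word_timestamps:
        kwargs["timestamp_granularities"] = ["word"]
    return kwargs


def transcribe(path, language="en", word_timestamps=False):
    """Send one audio file through the OpenAI SDK pointed at Cartesia."""
    # Imported lazily so the helper above works without the SDK installed.
    from openai import OpenAI

    client = OpenAI(
        api_key=os.environ["CARTESIA_API_KEY"],  # assumed variable name
        base_url=CARTESIA_BASE_URL,
    )
    with open(path, "rb") as audio_file:
        return client.audio.transcriptions.create(
            **build_transcription_kwargs(audio_file, language, word_timestamps)
        )


# Usage (requires network access and a valid key):
#   result = transcribe("audio.wav", word_timestamps=True)
#   print(result.text)  # assumes the response carries a "text" field
```

Keeping the request parameters in a separate helper makes it easy to confirm that nothing outside the supported set is being sent.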
Node.js Example
Direct API Usage
Both endpoints accept identical parameters and return the same JSON response format:
- Cartesia Native Endpoint: /stt
- OpenAI-Compatible Endpoint: /audio/transcriptions
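Either endpoint can also be called without the SDK as a multipart form POST. The sketch below uses the third-party requests library and rests on several assumptions that should be checked against the API reference: the Authorization: Bearer header (this is what the OpenAI SDK sends, so it is a reasonable guess for /audio/transcriptions, but the native /stt endpoint may expect different or additional headers), the "timestamp_granularities[]" form encoding for the array field, and the CARTESIA_API_KEY variable name. The helper names are hypothetical.

```python
import os

BASE_URL = "https://api.cartesia.ai"
ENDPOINTS = {
    "native": "/stt",
    "openai": "/audio/transcriptions",
}


def build_request(endpoint, api_key, language="en", word_timestamps=False):
    """Return (url, headers, form_fields) for a multipart POST."""
    url = BASE_URL + ENDPOINTS[endpoint]
    # Assumption: Bearer auth, as sent by the OpenAI SDK.
    headers = {"Authorization": f"Bearer {api_key}"}
    fields = [("model", "ink-whisper"), ("language", language)]
    if word_timestamps:
        # Assumption: array values use the "name[]" form-field convention.
        fields.append(("timestamp_granularities[]", "word"))
    return url, headers, fields


def transcribe_direct(path, endpoint="native", **options):
    """POST the audio file to the chosen endpoint and return the parsed JSON."""
    import requests  # third-party; imported lazily

    url, headers, fields = build_request(
        endpoint, os.environ["CARTESIA_API_KEY"], **options
    )
    with open(path, "rb") as audio_file:
        response = requests.post(
            url,
            headers=headers,
            data=fields,
            files={"file": audio_file},
            timeout=60,
        )
    response.raise_for_status()
    return response.json()
```

Because both endpoints take the same parameters, only the path differs between the two calls; everything else in the request is shared.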
Migration from OpenAI
To migrate from OpenAI’s Whisper API to Cartesia:
- Update the base URL: Change from https://api.openai.com/v1 to https://api.cartesia.ai
- Update authentication: Replace your OpenAI API key with your Cartesia API key
- Update model names: Use ink-whisper instead of OpenAI’s model names
- Keep the same endpoint: Continue using /audio/transcriptions
- Avoid unsupported parameters: Remove the prompt, temperature, and response_format parameters
- Use timestamp_granularities (Optional): Add timestamp_granularities: ["word"] to get word-level timestamps
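The request-level steps in this list (model name, unsupported parameters, optional timestamps) can be sketched as a small transformation over the keyword arguments of an existing Whisper call. The base URL and API key are client-level settings and are not handled here. The migrate_kwargs helper and the sample values are hypothetical, chosen only for illustration.

```python
UNSUPPORTED = ("prompt", "temperature", "response_format")


def migrate_kwargs(openai_kwargs, word_timestamps=False):
    """Return a copy of Whisper request kwargs adjusted for Cartesia."""
    # Drop the parameters the compatibility layer does not support.
    migrated = {k: v for k, v in openai_kwargs.items() if k not in UNSUPPORTED}
    # Swap in Cartesia's model name.
    migrated["model"] = "ink-whisper"
    if word_timestamps:
        migrated["timestamp_granularities"] = ["word"]
    return migrated


# "audio.wav" stands in for an open file object in a real call.
before = {
    "file": "audio.wav",
    "model": "whisper-1",
    "language": "en",
    "temperature": 0.2,
    "response_format": "json",
}
after = migrate_kwargs(before, word_timestamps=True)
print(after)
# {'file': 'audio.wav', 'model': 'ink-whisper', 'language': 'en', 'timestamp_granularities': ['word']}
```

The original dictionary is left untouched, so the same call site can be switched between providers while testing.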