Migrating From OpenAI Whisper to Cartesia Ink
Batch Speech-to-Text: This documentation covers OpenAI SDK compatibility for Cartesia Ink’s batched transcription endpoint.
For real-time transcription, use our Streaming STT endpoint.
Cartesia’s Batch Speech-to-Text API is compatible with OpenAI’s client libraries, enabling seamless migration from OpenAI Whisper.
Endpoints
Cartesia Native: /stt
- Full feature support
OpenAI Compatible: /audio/transcriptions
- Drop-in replacement for Whisper on the OpenAI SDK
Migration Guide for OpenAI SDK
Replace your OpenAI base URL with https://api.cartesia.ai to use Cartesia’s compatibility layer.
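A minimal sketch of that change with the OpenAI Python SDK (v1 or later) might look like the following. The helper names `cartesia_client_kwargs` and `make_client` are hypothetical, and sending the `Cartesia-Version` header through the SDK's `default_headers` option is an assumption about how to attach it, not something this page prescribes.

```python
CARTESIA_BASE_URL = "https://api.cartesia.ai"
CARTESIA_VERSION = "2025-04-16"  # version header value listed in the migration steps


def cartesia_client_kwargs(api_key: str) -> dict:
    """Return constructor arguments that point the OpenAI client at Cartesia."""
    return {
        "api_key": api_key,  # a Cartesia API key, not an OpenAI key
        "base_url": CARTESIA_BASE_URL,
        # Assumption: attach the version header to every request via default_headers.
        "default_headers": {"Cartesia-Version": CARTESIA_VERSION},
    }


def make_client(api_key: str):
    """Build an OpenAI SDK client configured for Cartesia's compatibility layer."""
    # Imported lazily so cartesia_client_kwargs can be used without the SDK installed.
    from openai import OpenAI

    return OpenAI(**cartesia_client_kwargs(api_key))
```

Keeping the settings in a plain dictionary makes the configuration easy to inspect or log before a client is created.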
Parameter Support
Supported Parameters:
- file: The audio file to transcribe
- model: Use ink-whisper for Cartesia’s latest model
- language: Input audio language (ISO-639-1 format)
- timestamp_granularities: Include ["word"] to get word-level timestamps
Response Format: Always returns JSON with transcribed text, duration, language, and optionally word timestamps.
For the complete parameter reference, see our Batch STT API documentation.
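The response shape described above can be handled with a small parser. This is only a sketch: the JSON key names (`text`, `duration`, `language`, `words`, and the per-word `word`, `start`, `end`) are assumptions inferred from the description, and the `Transcript` and `Word` types are hypothetical. Check the Batch STT API documentation for the exact schema.

```python
import json
from dataclasses import dataclass, field


@dataclass
class Word:
    word: str
    start: float  # assumed to be seconds from the start of the audio
    end: float


@dataclass
class Transcript:
    text: str
    duration: float
    language: str
    words: list = field(default_factory=list)


def parse_transcript(body: str) -> Transcript:
    """Parse a transcription response body (key names are assumed, see above)."""
    data = json.loads(body)
    # Word timestamps are optional, so treat a missing or null list as empty.
    words = [Word(w["word"], w["start"], w["end"]) for w in data.get("words") or []]
    return Transcript(
        text=data["text"],
        duration=data.get("duration"),
        language=data.get("language"),
        words=words,
    )
```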
Python Example
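As an illustration, a transcription call through the OpenAI Python SDK could be wrapped as below. The `transcribe` helper and its arguments are hypothetical; `client` is assumed to be an `OpenAI` instance created with your Cartesia API key and `base_url="https://api.cartesia.ai"`.

```python
from pathlib import Path


def transcribe(client, audio_path, language="en", word_timestamps=False):
    """Send one audio file to /audio/transcriptions through an OpenAI-style client."""
    params = {"model": "ink-whisper", "language": language}
    if word_timestamps:
        # Optional: request word-level timestamps.
        params["timestamp_granularities"] = ["word"]
    with Path(audio_path).open("rb") as audio_file:
        return client.audio.transcriptions.create(file=audio_file, **params)
```

A call such as `transcribe(client, "meeting.wav", word_timestamps=True)` (the file name is a placeholder) returns the SDK's parsed response object, whose `text` attribute holds the transcript.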
Node.js Example
Direct API Usage
Both endpoints accept identical parameters and return the same JSON response format:
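Because the two endpoints take the same parameters, one request builder can serve both. The sketch below uses only the Python standard library and builds the request without sending it. Treat the details as assumptions rather than confirmed API behavior: the `Authorization: Bearer` header (the form the OpenAI SDK sends), the `timestamp_granularities[]` form field name, and the hypothetical `build_request` helper itself.

```python
import urllib.request
import uuid

BASE_URL = "https://api.cartesia.ai"


def build_request(path, api_key, filename, audio_bytes, language="en", word_timestamps=False):
    """Build (but do not send) a multipart POST for /stt or /audio/transcriptions."""
    boundary = uuid.uuid4().hex
    fields = [("model", "ink-whisper"), ("language", language)]
    if word_timestamps:
        # Assumption: array parameters use the "name[]" form-field convention.
        fields.append(("timestamp_granularities[]", "word"))

    parts = []
    for name, value in fields:
        part = f'--{boundary}\r\nContent-Disposition: form-data; name="{name}"\r\n\r\n{value}\r\n'
        parts.append(part.encode())

    file_header = (
        f"--{boundary}\r\n"
        f'Content-Disposition: form-data; name="file"; filename="{filename}"\r\n'
        "Content-Type: application/octet-stream\r\n\r\n"
    )
    parts.append(file_header.encode() + audio_bytes + b"\r\n")
    parts.append(f"--{boundary}--\r\n".encode())  # closing boundary

    return urllib.request.Request(
        BASE_URL + path,
        data=b"".join(parts),
        method="POST",
        headers={
            "Authorization": f"Bearer {api_key}",  # assumption: Bearer auth is accepted
            "Cartesia-Version": "2025-04-16",
            "Content-Type": f"multipart/form-data; boundary={boundary}",
        },
    )
```

Passing `"/stt"` or `"/audio/transcriptions"` as `path` selects the endpoint; the resulting object can then be sent with `urllib.request.urlopen` and the body decoded as JSON.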
Cartesia Native Endpoint
OpenAI-Compatible Endpoint
Migration from OpenAI
To migrate from OpenAI’s Whisper API to Cartesia:
- Update the base URL: Change from https://api.openai.com/v1 to https://api.cartesia.ai
- Update authentication: Replace your OpenAI API key with your Cartesia API key
- Add version header: Include Cartesia-Version: 2025-04-16 in requests
- Update model names: Use ink-whisper instead of OpenAI’s model names
- Keep the same endpoint: Continue using /audio/transcriptions
- Avoid unsupported parameters: Remove the prompt, temperature, and response_format parameters
- Use timestamp_granularities (optional): Add timestamp_granularities: ["word"] to get word-level timestamps
The core functionality remains the same, with JSON responses containing transcribed text and optional word timestamps.
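The parameter changes in the migration steps can be sketched as a small adapter that rewrites the keyword arguments of an existing Whisper call. The helper name `to_cartesia_kwargs` is hypothetical; it only drops the three parameters named above, swaps the model name, and optionally adds word timestamps.

```python
UNSUPPORTED = {"prompt", "temperature", "response_format"}


def to_cartesia_kwargs(openai_kwargs: dict, word_timestamps: bool = False) -> dict:
    """Adapt keyword arguments from an OpenAI Whisper call for Cartesia."""
    # Drop the parameters the compatibility layer does not support.
    kwargs = {k: v for k, v in openai_kwargs.items() if k not in UNSUPPORTED}
    # Replace the OpenAI model name with Cartesia's.
    kwargs["model"] = "ink-whisper"
    if word_timestamps:
        kwargs["timestamp_granularities"] = ["word"]
    return kwargs
```

The result can be passed to `client.audio.transcriptions.create(file=..., **kwargs)` on a client whose base URL and API key have already been switched to Cartesia.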