STT
Ink is a new family of streaming speech-to-text (STT) models for developers building real-time voice applications.
- ● the latest stable snapshot of the model
To use the stable version of the model, we recommend using the base model name (e.g. ink-whisper
).
In many cases the stable and preview snapshots are the same, but in some cases the preview snapshot may have additional features or improvements.
ink-whisper
Ink Whisper is the fastest, most affordable speech-to-text model — engineered for enterprise deployment in production-grade voice agents. It delivers higher accuracy than baseline Whisper and is optimized for real-time performance in a wide variety of real-world conditions.
Additional Capabilities:
- Handles variable-length audio chunks and interruptions gracefully using dynamic chunking.
- Reliably transcribes speech with background noise.
- Accurately transcribes audio with telephony artifacts, accents, and disfluencies.
- Excels at transcribing proper nouns and domain-specific terminology.
To learn how to use the Ink STT family, see the Speech-to-Text API Reference.
Selecting a Model
When making API calls, you can specify either:
Continuous updates
All models have a base model name (e.g. ink-whisper
).
We recommend using these for prototyping and development, then switching to a date-versioned model for production use cases to ensure stability.
Future Updates
New snapshots are released periodically with improvements in performance, additional language support, and new capabilities. Check back regularly for updates.