Ink 2 is our fastest, most accurate streaming speech-to-text model, with the lowest word error rate and best turn detection of any streaming STT. It is built for production voice agents: it transcribes structured data like phone numbers, dates, and emails right the first time, and it knows when a speaker starts and finishes, so no separate VAD is required.

Turn detection is built in. Ink 2 emits a full lifecycle of turn events (turn.start, turn.update, turn.eager_end, turn.resume, and turn.end) so your agent knows exactly when to listen, think, and respond. See Turn Events for the state machine, or Compare STT Endpoints to run Ink 2 with or without turn detection.
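To make the turn lifecycle concrete, here is a minimal sketch of how an agent might react to the five event names listed above. It does not call any SDK. The event payload shape (a dict with "type" and an optional "text"), the TurnTracker class, the state names, and the action names are all assumptions made for illustration. The reading of turn.eager_end as a tentative end of turn and turn.resume as the speaker continuing is inferred from the event names. Turn Events remains the authoritative description of the state machine.

```python
from dataclasses import dataclass, field


@dataclass
class TurnTracker:
    """Hypothetical helper that maps turn events to listen/think/respond states."""

    state: str = "idle"
    transcript: str = ""
    actions: list = field(default_factory=list)

    def handle(self, event: dict) -> str:
        # Assumed payload shape: {"type": "<event name>", "text": "<optional transcript>"}
        kind = event["type"]
        if kind == "turn.start":
            # The speaker started talking: listen, and stop any agent audio.
            self.state = "listening"
            self.transcript = ""
            self.actions.append("stop_playback")
        elif kind == "turn.update":
            # Transcript so far for the current turn.
            self.transcript = event.get("text", self.transcript)
        elif kind == "turn.eager_end":
            # Assumed meaning: the turn has probably ended, so start thinking early.
            self.state = "thinking"
            self.actions.append("start_draft_reply")
        elif kind == "turn.resume":
            # Assumed meaning: the speaker kept going, so drop the early draft.
            self.state = "listening"
            self.actions.append("discard_draft_reply")
        elif kind == "turn.end":
            # The turn is over: respond using the final transcript.
            self.state = "responding"
            self.transcript = event.get("text", self.transcript)
            self.actions.append("send_reply")
        return self.state


# Simulated event stream; a real stream would arrive from the STT connection.
events = [
    {"type": "turn.start"},
    {"type": "turn.update", "text": "my number is"},
    {"type": "turn.eager_end"},
    {"type": "turn.resume"},
    {"type": "turn.update", "text": "my number is 555 0100"},
    {"type": "turn.end", "text": "my number is 555 0100"},
]

tracker = TurnTracker()
states = [tracker.handle(e) for e in events]
```

Keeping the handler a pure function of the incoming event makes it easy to replay recorded event streams in tests before wiring it to a live connection.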

Models

| Model | Release Date | Languages | Status |
| --- | --- | --- | --- |
| ink-2 | May 22, 2026 | en | Preview |

For information on ink-whisper, see our page on Older STT Models.

Where to go next

Try it out online

No sign-up or code required

Start building

Guides and best practices