Ink 2

Ink 2 is our fastest, most accurate streaming speech-to-text model, with the lowest word error rate and best turn detection of any streaming STT, and it is built for production voice agents. It transcribes structured data such as phone numbers, dates, and email addresses correctly the first time, and it detects when a speaker starts and finishes, so no separate voice activity detection (VAD) is required.

Turn detection is built in. Ink 2 emits a full lifecycle of turn events (`turn.start`, `turn.update`, `turn.eager_end`, `turn.resume`, and `turn.end`) so your agent knows exactly when to listen, think, and respond. See Turn Events for the state machine, or Compare STT Endpoints to run Ink 2 with or without turn detection.
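To make the turn lifecycle concrete, here is a minimal sketch of how an agent might track the five turn events. Only the event names come from this page. The payload shape (a dict with `type` and an optional `transcript`), the `TurnTracker` class, and the transition rules are assumptions for illustration, not the documented state machine; see Turn Events for the authoritative behavior.

```python
from dataclasses import dataclass, field

# Hypothetical agent-side states; they are not part of the Ink 2 API.
IDLE, LISTENING, THINKING, RESPONDING = "idle", "listening", "thinking", "responding"


@dataclass
class TurnTracker:
    """Tracks what the agent should be doing, based on incoming turn events."""

    state: str = IDLE
    transcript: str = ""
    history: list = field(default_factory=list)

    def handle(self, event: dict) -> str:
        # Assumed payload shape: {"type": "<event name>", "transcript": "<text>"}.
        kind = event["type"]
        if kind == "turn.start":
            # Speaker began talking: listen and reset the running transcript.
            self.state = LISTENING
            self.transcript = ""
        elif kind == "turn.update":
            # Interim transcript: keep the latest text, state is unchanged.
            self.transcript = event.get("transcript", self.transcript)
        elif kind == "turn.eager_end":
            # Speaker may be done: start preparing a reply speculatively.
            self.transcript = event.get("transcript", self.transcript)
            self.state = THINKING
        elif kind == "turn.resume":
            # Speaker kept going: drop the speculative reply and listen again.
            self.state = LISTENING
        elif kind == "turn.end":
            # Turn is final: commit the transcript and respond.
            self.transcript = event.get("transcript", self.transcript)
            self.state = RESPONDING
        else:
            raise ValueError(f"unknown turn event: {kind!r}")
        self.history.append((kind, self.state))
        return self.state


# Made-up sample stream in which the speaker pauses and then continues.
SAMPLE_EVENTS = [
    {"type": "turn.start"},
    {"type": "turn.update", "transcript": "my number is"},
    {"type": "turn.eager_end", "transcript": "my number is"},
    {"type": "turn.resume"},
    {"type": "turn.update", "transcript": "my number is 555 0100"},
    {"type": "turn.end", "transcript": "my number is 555 0100"},
]

if __name__ == "__main__":
    tracker = TurnTracker()
    for event in SAMPLE_EVENTS:
        print(event["type"], "->", tracker.handle(event))
```

In this sketch, `turn.eager_end` moves the agent to a speculative "thinking" state that `turn.resume` can cancel, which is one plausible way to start generating a reply early without committing to it before `turn.end`.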

| Model | Release Date | Languages | Status |
| --- | --- | --- | --- |
| `ink-2` | May 22, 2026 | `en` | Stable |

For information on ink-whisper, see our page on Older STT Models.

Where to go next

Try it out online

No sign-up or code required

Start building

Guides and best practices
