
# Migrating from Deepgram Nova with Automatic Finalization

This guide covers migrating from Deepgram Live Audio (Nova) when used without sending the `Finalize` command mid-session.

<Card horizontal title="All migration guides" icon="arrow-left-from-arc" href="/use-the-api/stt/migrations" />

This guide covers both the raw WebSocket API and the Cartesia SDK. To install the SDK:

<CodeGroup>
  ```bash Python theme={null}
  pip install cartesia
  ```

  ```bash TypeScript theme={null}
  npm i @cartesia/cartesia-js
  ```
</CodeGroup>

<Info>
  If you're already using the Cartesia SDK, upgrade to version `>=3.2.0`.
</Info>

<Note>
  Ink 2 only supports English right now.\
  We expect to add more languages in the coming months.
</Note>

## Connection

Replace the Deepgram WebSocket URL and auth header with Cartesia's `/stt/turns/websocket`.

```diff theme={null}
- wss://api.deepgram.com/v1/listen?model=nova-3&encoding=linear16&sample_rate=16000
+ wss://api.cartesia.ai/stt/turns/websocket?model=ink-2&encoding=pcm_s16le&sample_rate=16000
```

```diff theme={null}
- Authorization: Token <DEEPGRAM_API_KEY>
+ Authorization: Bearer <CARTESIA_API_KEY>
+ Cartesia-Version: 2026-03-01
```

In browsers, WebSockets do not support custom request headers. Pass the API version as the `cartesia_version` query param, and authenticate with a short-lived [access token](/get-started/authenticate-your-client-applications) in the `access_token` query param instead of an API key.
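As a rough illustration of the query-param form, the sketch below assembles such a URL using only the parameter names mentioned in this guide. The helper name `build_browser_ws_url` is hypothetical, and the query string would look the same from any client language; Python is used here only for brevity.

```python
from urllib.parse import urlencode

CARTESIA_STT_URL = "wss://api.cartesia.ai/stt/turns/websocket"


def build_browser_ws_url(access_token: str, sample_rate: int = 16000) -> str:
    """Build a WebSocket URL that carries auth and API version as query params.

    Hypothetical helper: only the parameter names come from this guide.
    """
    params = {
        "model": "ink-2",
        "encoding": "pcm_s16le",
        "sample_rate": sample_rate,
        # Replaces the Cartesia-Version header
        "cartesia_version": "2026-03-01",
        # Replaces the Authorization header; use a short-lived token, not an API key
        "access_token": access_token,
    }
    return f"{CARTESIA_STT_URL}?{urlencode(params)}"


print(build_browser_ws_url("example-token"))
```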

Connect to the auto-finalization WebSocket with the Cartesia SDK:

<CodeGroup>
  ```python Python (Async) theme={null}
  import os
  from cartesia import AsyncCartesia

  client = AsyncCartesia(api_key=os.getenv("CARTESIA_API_KEY"))

  async with client.stt.auto_finalize.websocket(
      model="ink-2", encoding="pcm_s16le", sample_rate=16000
  ) as connection:
      ...
  ```

  ```python Python theme={null}
  import os
  from cartesia import Cartesia

  client = Cartesia(api_key=os.getenv("CARTESIA_API_KEY"))

  with client.stt.auto_finalize.websocket(
      model="ink-2", encoding="pcm_s16le", sample_rate=16000
  ) as connection:
      ...
  ```

  ```typescript TypeScript theme={null}
  import Cartesia from "@cartesia/cartesia-js";

  const client = new Cartesia({ apiKey: process.env.CARTESIA_API_KEY });

  const connection = client.stt.autoFinalize.websocket({
    model: "ink-2",
    encoding: "pcm_s16le",
    sample_rate: 16000,
  });
  ```

  ```typescript TypeScript (Browser) theme={null}
  // Server-side: generate access tokens using your API key
  import Cartesia from '@cartesia/cartesia-js';

  const client = new Cartesia({ apiKey: process.env.CARTESIA_API_KEY });

  export async function GET() {
    const { token } = await client.accessToken.create({
      grants: { stt: true, tts: false, agent: false },
      // How long the token lasts in seconds
      // Allowed values: 0–3600
      expires_in: 3600,
    });
    return Response.json({ token });
  }


  // Client-side
  // 1. Fetch an access token from your server
  // 2. Connect to Cartesia via WebSocket
  import Cartesia from "@cartesia/cartesia-js";

  async function getToken(): Promise<string> {
    const res = await fetch('/replace-with-your-server');
    const { token } = await res.json();
    return token;
  }
  const audioContext = new AudioContext();

  const client = new Cartesia({ token: await getToken() });

  const connection = client.stt.autoFinalize.websocket({
    model: "ink-2",
    encoding: "pcm_f32le",
    sample_rate: audioContext.sampleRate,
  });
  ```
</CodeGroup>

## Query parameters

| Deepgram Nova                                                      | Cartesia Realtime STT (Auto)                                                | Notes                                                                                                  |
| ------------------------------------------------------------------ | --------------------------------------------------------------------------- | ------------------------------------------------------------------------------------------------------ |
| `model=nova-3` <Badge color="red" size="sm">required</Badge>       | `model=ink-2` <Badge color="red" size="sm">required</Badge>                 | See [Models](/build-with-cartesia/stt/latest) for all options.                                         |
| `version=latest`                                                   | —                                                                           | Model version is controllable via the `model` param.                                                   |
| `encoding=linear16`                                                | `encoding=pcm_s16le` <Badge color="red" size="sm">required</Badge>          | See [encoding](#encoding) for all options.                                                             |
| `sample_rate`                                                      | `sample_rate` <Badge color="red" size="sm">required</Badge>                 | No change.                                                                                             |
| `language`                                                         | —                                                                           | `ink-2` only supports `en` right now. More languages are coming soon!                                  |
| —                                                                  | `cartesia_version=2026-03-01` <Badge color="red" size="sm">required</Badge> | See [API Conventions](/use-the-api/api-conventions#always-send-a-cartesia-version-header) for details. |
| `channels`, `multichannel`                                         | —                                                                           | Send a mono audio stream per WebSocket connection.                                                     |
| `endpointing`, `interim_results`, `utterance_end_ms`, `vad_events` | —                                                                           | Not required.                                                                                          |
| `keyterm`, `keywords`                                              | —                                                                           | Coming soon!                                                                                           |
| `mip_opt_out`                                                      | —                                                                           | Controlled by your organization.                                                                       |

<Accordion title="encoding">
  | Deepgram      | Cartesia      |
  | ------------- | ------------- |
  | `linear16`    | `pcm_s16le`   |
  | `linear32`    | `pcm_s32le`   |
  | `mulaw`       | `pcm_mulaw`   |
  | `alaw`        | `pcm_alaw`    |
  | Not supported | `pcm_f16le`   |
  | Not supported | `pcm_f32le`   |
  | `flac`        | Not supported |
  | `amr-nb`      | Not supported |
  | `amr-wb`      | Not supported |
  | `opus`        | Not supported |
  | `ogg-opus`    | Not supported |
  | `speex`       | Not supported |
  | `g729`        | Not supported |
</Accordion>
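If your Deepgram URLs are built in code, the two tables above can be applied mechanically. The following is a minimal sketch, assuming the URL already carries `encoding` and `sample_rate`; `translate_query` is a hypothetical helper, and it drops every Deepgram parameter that has no Cartesia counterpart.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit

# Encodings that exist on both sides, per the encoding table above.
ENCODING_MAP = {
    "linear16": "pcm_s16le",
    "linear32": "pcm_s32le",
    "mulaw": "pcm_mulaw",
    "alaw": "pcm_alaw",
}


def translate_query(deepgram_url: str, cartesia_model: str = "ink-2") -> str:
    """Translate a Deepgram listen URL into a Cartesia turns URL (sketch)."""
    query = dict(parse_qsl(urlsplit(deepgram_url).query))

    encoding = query.get("encoding")
    if encoding not in ENCODING_MAP:
        # e.g. flac, opus, ogg-opus: transcode to PCM before streaming
        raise ValueError(f"No Cartesia equivalent for encoding {encoding!r}")
    if "sample_rate" not in query:
        raise ValueError("sample_rate is required by Cartesia")

    # Only these four params are carried over; everything else is dropped.
    params = {
        "model": cartesia_model,
        "encoding": ENCODING_MAP[encoding],
        "sample_rate": query["sample_rate"],
        "cartesia_version": "2026-03-01",
    }
    return "wss://api.cartesia.ai/stt/turns/websocket?" + urlencode(params)


print(translate_query(
    "wss://api.deepgram.com/v1/listen?model=nova-3&encoding=linear16&sample_rate=16000"
))
```

Raising on an unmapped encoding, rather than silently falling back, makes it obvious at startup that the audio needs to be transcoded first.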

## Sending audio

Both APIs accept raw PCM audio as binary WebSocket frames in the same way.

> Cartesia does not support these encodings: `flac`, `amr-nb`, `amr-wb`, `opus`, `ogg-opus`, `speex`, `g729`

Deepgram's `Finalize` command has **no equivalent**. Ink detects turn boundaries on its own and emits a `turn.end` when the user stops speaking, so there is nothing to flush.

```diff theme={null}
- { "type": "Finalize" }
```

<Warning>
  If you currently send the `Finalize` command mid-session with Deepgram Nova, consider using Cartesia Ink with [manual finalization](/use-the-api/stt/migrate-from-deepgram-nova/manual) instead.

  Take a look at the [migration guides](/use-the-api/stt/migrations) page for details.
</Warning>

To close the session cleanly, send a JSON text frame:

```diff theme={null}
- { "type": "CloseStream" }
+ { "type": "close" }
```

Cartesia has no equivalent of Deepgram's `KeepAlive` message. The connection has a 3-minute idle timeout that resets every time you send an audio chunk — keep streaming audio (silent or otherwise) to hold it open.

<CodeGroup>
  ```python Python (Async) theme={null}
  # Equivalent to 
  # deepgram_connection.send_media(audio_chunk)
  await connection.send_raw(audio_chunk)

  # Equivalent to
  # deepgram_connection.send_close_stream()
  await connection.send({"type": "close"})
  ```

  ```python Python theme={null}
  # Equivalent to 
  # deepgram_connection.send_media(audio_chunk)
  connection.send_raw(audio_chunk)

  # Equivalent to
  # deepgram_connection.send_close_stream()
  connection.send({"type": "close"})
  ```

  ```typescript TypeScript theme={null}
  // Equivalent to deepgramConnection.sendMedia(audioChunk)
  // @param {ArrayBufferLike} audioChunk - Note: Blob is not accepted
  connection.sendRaw(audioChunk);

  // Equivalent to
  // deepgramConnection.sendCloseStream({ type: "CloseStream" })
  connection.send({ type: "close" });
  ```
</CodeGroup>
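Since there is no `KeepAlive` message, one way to hold an idle connection open is to send silence. The helper below is a sketch that builds a zero-valued mono PCM chunk; `silence_chunk` is a hypothetical name, and the result would be passed to `connection.send_raw(...)` as in the snippets above.

```python
def silence_chunk(sample_rate: int, duration_ms: int, bytes_per_sample: int = 2) -> bytes:
    """Return zero-valued mono PCM covering duration_ms of audio.

    All-zero bytes are silence for the linear and float PCM encodings
    (use bytes_per_sample=4 for pcm_s32le / pcm_f32le). They are NOT
    silence for pcm_mulaw or pcm_alaw, which use companded sample values.
    """
    n_samples = sample_rate * duration_ms // 1000
    return bytes(n_samples * bytes_per_sample)


# 100 ms of pcm_s16le silence at 16 kHz: 1600 samples * 2 bytes each
chunk = silence_chunk(sample_rate=16000, duration_ms=100)
print(len(chunk))  # 3200
```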

## Event mapping

Deepgram emits four server message types, mixing transcript results with separate voice-activity signals.

Cartesia folds the same information into a turn lifecycle: `turn.start`, `turn.update`, `turn.eager_end`, `turn.resume`, and `turn.end`. See [Turn Detection](/use-the-api/stt/turns) for the full state machine.

| Deepgram `type`               | Cartesia `type`  | Notes                                                                                              |
| ----------------------------- | ---------------- | -------------------------------------------------------------------------------------------------- |
| `SpeechStarted`               | `turn.start`     | The user began speaking. Carries no transcript.                                                    |
| `Results` (`is_final: false`) | `turn.update`    | Interim transcript for the utterance / turn.                                                       |
| `Results` (`is_final: true`)  | `turn.end`       | Final transcript for the utterance / turn.                                                         |
| `UtteranceEnd`                | `turn.end`       | The user stopped speaking.                                                                         |
| —                             | `turn.eager_end` | The model predicts the user might be done speaking. Okay to ignore.                                |
| —                             | `turn.resume`    | The user kept talking; ignore the last `turn.eager_end`.                                           |
| `Metadata`                    | —                | No equivalent.                                                                                     |
| —                             | `connected`      | Fires once when the WebSocket is established. You do not need to wait for it before sending audio. |
| —                             | `error`          | Client or server errors.                                                                           |
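The `turn.eager_end` / `turn.resume` pair has no Deepgram counterpart, so it may help to see how a client could track it. The sketch below keeps a speculative draft that is discarded on `turn.resume` and confirmed on `turn.end`. It operates on plain dicts with `type` and `transcript` keys rather than the SDK's typed response objects, and `TurnTracker` is a hypothetical class, not part of the SDK.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class TurnTracker:
    # Transcript from a turn.eager_end that has not been confirmed yet
    draft: Optional[str] = None
    # Transcripts confirmed by turn.end
    committed: List[str] = field(default_factory=list)

    def handle(self, event: dict) -> None:
        kind = event.get("type")
        if kind == "turn.eager_end":
            # The user might be done: start speculative work on this transcript
            self.draft = event.get("transcript", "")
        elif kind == "turn.resume":
            # The user kept talking: throw the speculative work away
            self.draft = None
        elif kind == "turn.end":
            # The turn is over: the final transcript supersedes any draft
            self.committed.append(event.get("transcript", ""))
            self.draft = None


tracker = TurnTracker()
for event in [
    {"type": "turn.start"},
    {"type": "turn.eager_end", "transcript": "Hello!"},
    {"type": "turn.resume"},
    {"type": "turn.end", "transcript": "Hello! How are you?"},
]:
    tracker.handle(event)
print(tracker.committed)
```

Clients that do not need the latency win can skip both events and act only on `turn.end`.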

### The Deepgram `Results` message

Deepgram's `Results` messages carry a lot of information. You can extract the equivalent information from Cartesia's `turn.update` and `turn.end` events.

| Deepgram `Results`   | Cartesia      | Notes                                                                                   |
| -------------------- | ------------- | --------------------------------------------------------------------------------------- |
| `is_final: false`    | `turn.update` | Interim transcript for the utterance / turn. Cumulative since the last final transcript. |
| `is_final: true`     | `turn.end`    | Final transcript for the utterance / turn.                                              |
| `speech_final: true` | `turn.end`    | The user stopped speaking.                                                              |

<Info>
  Deepgram sends `UtteranceEnd` and `speech_final: true` separately, but they have the same semantic meaning: "the user has finished speaking".

  Cartesia simplifies this into a single high-accuracy signal: `turn.end`.
</Info>

A Deepgram `Results` message:

```json theme={null}
{
  "type": "Results",
  "is_final": true,
  "speech_final": true,
  "channel": {
    "alternatives": [
      {
        "transcript": "Hello world!",
        "confidence": 0.99,
        "words": [ ... ]
      }
    ]
  },
  "metadata": { ... }
}
```

Becomes a Cartesia `turn.update` / `turn.end` event:

```json theme={null}
{
  "type": "turn.end",
  "transcript": "Hello world!",
  "request_id": "33cacee6-1936-4949-a05b-ecc9f2393248"
}
```

<Info>`turn.start` and `turn.resume` events do not carry a transcript.</Info>
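If downstream code still expects the Deepgram shape, a thin adapter can bridge the two during migration. This is a sketch under the mapping in the tables above: `to_deepgram_results` is a hypothetical helper that works on plain dicts, and it omits fields that the Cartesia event shown here does not carry (`confidence`, `words`, `metadata`).

```python
from typing import Optional


def to_deepgram_results(event: dict) -> Optional[dict]:
    """Wrap a Cartesia turn event in a Deepgram-style Results dict (sketch).

    Returns None for events that carry no transcript mapping,
    such as turn.start and turn.resume.
    """
    kind = event.get("type")
    if kind not in ("turn.update", "turn.end"):
        return None

    is_end = kind == "turn.end"
    return {
        "type": "Results",
        # turn.end plays the role of both is_final and speech_final
        "is_final": is_end,
        "speech_final": is_end,
        "channel": {
            # Transcript is passed through untouched, including leading spaces
            "alternatives": [{"transcript": event.get("transcript", "")}]
        },
    }


print(to_deepgram_results({"type": "turn.end", "transcript": "Hello world!"}))
```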

<CodeGroup>
  ```python Python (Async) theme={null}
  import asyncio
  from cartesia.types.stt import STTAutoFinalizeWebsocketResponse

  final_transcript = ""

  def on_message(message: STTAutoFinalizeWebsocketResponse) -> None:
      global final_transcript
      if message.type == "turn.start":
          print("User started speaking")
      elif message.type == "turn.update":
          print("Transcript so far: " + final_transcript + message.transcript)
      elif message.type == "turn.end":
          print("User stopped speaking")
          # Do not strip or add spaces!
          final_transcript += message.transcript
      elif message.type == "error":
          print(f"Error: {message.message}")

  # Equivalent to
  # deepgram_connection.on(EventType.MESSAGE, on_message)
  connection.on("event", on_message)

  # Equivalent to
  # asyncio.create_task(deepgram_connection.start_listening())
  recv_task = asyncio.create_task(connection.dispatch_events())
  ```

  ```python Python theme={null}
  import threading
  from cartesia.types.stt import STTAutoFinalizeWebsocketResponse

  final_transcript = ""

  def on_message(message: STTAutoFinalizeWebsocketResponse) -> None:
      global final_transcript
      if message.type == "turn.start":
          print("User started speaking")
      elif message.type == "turn.update":
          print("Transcript so far: " + final_transcript + message.transcript)
      elif message.type == "turn.end":
          print("User stopped speaking")
          # Do not strip or add spaces!
          final_transcript += message.transcript
      elif message.type == "error":
          print(f"Error: {message.message}")

  # Equivalent to
  # deepgram_connection.on(EventType.MESSAGE, on_message)
  connection.on("event", on_message)

  # Equivalent to
  # threading.Thread(target=deepgram_connection.start_listening, daemon=True).start()
  threading.Thread(target=connection.dispatch_events, daemon=True).start()
  ```

  ```typescript TypeScript theme={null}
  import Cartesia from '@cartesia/cartesia-js';

  let finalTranscript = '';

  // Equivalent to
  // deepgramConnection.on("message", (message) => { ... });
  connection.on("event", (message: Cartesia.STT.AutoFinalize.STTAutoFinalizeWebsocketResponse) => {
    switch (message.type) {
      case "turn.start":
        console.log("User started speaking");
        break;
      case "turn.update":
        console.log("Transcript so far: " + finalTranscript + message.transcript);
        break;
      case "turn.end":
        console.log("User stopped speaking");
        // Do not trim or add spaces!
        finalTranscript += message.transcript;
        break;
    }
  });

  // Equivalent to
  // deepgramConnection.on("error", (error) => { ... });
  connection.on("error", (error) => {
    if (error.error) {
      // Server sent error (may be a bad request or internal server error)
      console.error(`Server sent an error: ${error.error.message}`);
    } else {
      // Client error
      console.error(`Client had an error: ${error.message}`);
    }
  });
  ```
</CodeGroup>

## Example Server Messages

The table below traces the messages each API emits for this spoken utterance:

> Hello! Nova's transcripts are joined with spaces. Ink's are not.

| Deepgram Nova                                                                                        | Cartesia Realtime STT (Auto)                                                                    |
| ---------------------------------------------------------------------------------------------------- | ----------------------------------------------------------------------------------------------- |
| <Badge>SpeechStarted</Badge>                                                                         | <Badge>turn.start</Badge>                                                                       |
| <Badge color="orange">is\_final: false</Badge> `"Hello!"`                                            | <Badge color="orange">turn.update</Badge> `"Hello!"`                                            |
| —                                                                                                    | <Badge>turn.eager\_end</Badge> `"Hello!"`                                                       |
| —                                                                                                    | <Badge>turn.resume</Badge>                                                                      |
| <Badge color="orange">is\_final: false</Badge> `"Hello! Nova's transcripts are joined with spaces."` | <Badge color="orange">turn.update</Badge> `"Hello! Nova's transcripts are joined with spaces."` |
| —                                                                                                    | <Badge>turn.eager\_end</Badge> `"Hello! Nova's transcripts are joined with spaces."`            |
| <Badge color="green">is\_final: true</Badge> `"Hello! Nova's transcripts are joined with spaces."`   | <Badge color="green">turn.end</Badge> `"Hello! Nova's transcripts are joined with spaces."`     |
| <Badge>UtteranceEnd</Badge>                                                                          | —                                                                                               |
| <Badge>SpeechStarted</Badge>                                                                         | <Badge>turn.start</Badge>                                                                       |
| <Badge color="orange">is\_final: false</Badge> `"Ink's are not."`                                    | <Badge color="orange">turn.update</Badge> `" Ink's are not."`                                   |
| —                                                                                                    | <Badge>turn.eager\_end</Badge> `" Ink's are not."`                                              |
| <Badge color="green">is\_final: true</Badge> `"Ink's are not."`                                      | <Badge color="green">turn.end</Badge> `" Ink's are not."`                                       |
| <Badge>UtteranceEnd</Badge>                                                                          | —                                                                                               |
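The practical difference shows up when you stitch final transcripts together. Using the final transcripts from the table above, this small sketch shows that Nova's finals need a separator while Ink's already carry their own leading whitespace:

```python
# Final transcripts as they appear in the table above
nova_finals = [
    "Hello! Nova's transcripts are joined with spaces.",
    "Ink's are not.",
]
ink_finals = [
    "Hello! Nova's transcripts are joined with spaces.",
    " Ink's are not.",  # note the leading space
]

nova_joined = " ".join(nova_finals)  # Deepgram: add the space yourself
ink_joined = "".join(ink_finals)     # Cartesia: concatenate as-is

# Both approaches yield the same full transcript
assert nova_joined == ink_joined
print(ink_joined)
```

Stripping Ink's transcripts before concatenating, or joining them with spaces, would respectively glue sentences together or double the whitespace.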

## References

<CardGroup cols={2}>
  <Card icon="code" title="API Reference" href="/api-reference/stt/turns/websocket">
    Cartesia Realtime STT (Auto)
  </Card>

  <Card icon="brackets-curly" title="Full Code Example" href="/examples/stt-auto-finalize-websocket">
    Using the Cartesia SDK
  </Card>
</CardGroup>
