# Migrating from ElevenLabs with Manual Commits

This guide covers migrating from ElevenLabs Realtime Speech to Text when used with `commit_strategy=manual`.

<Card horizontal title="All migration guides" icon="arrow-left-from-arc" href="/use-the-api/stt/migrations" />

This guide contains both bare API descriptions and SDK code. To install the SDK:

<CodeGroup>
  ```bash Python theme={null}
  pip install cartesia
  ```

  ```bash TypeScript theme={null}
  npm i @cartesia/cartesia-js
  ```
</CodeGroup>

<Info>
  If you're already using the Cartesia SDK, upgrade to version `>=3.2.0`.
</Info>

## Connection

Replace the ElevenLabs WebSocket URL with Cartesia's `/stt/websocket` endpoint, and swap the ElevenLabs auth header for Cartesia's `x-api-key` and `cartesia-version` headers.

```diff theme={null}
- wss://api.elevenlabs.io/v1/speech-to-text/realtime?model_id=scribe_v2_realtime&audio_format=pcm_16000&commit_strategy=manual
+ wss://api.cartesia.ai/stt/websocket?model=ink-2&encoding=pcm_s16le&sample_rate=16000
```

```diff theme={null}
- xi-api-key: <ELEVENLABS_API_KEY>
+ x-api-key: <CARTESIA_API_KEY>
+ cartesia-version: 2026-03-01
```

Browsers cannot set request headers on WebSocket connections. Instead, pass the API version as the `cartesia_version` query parameter, and authenticate with a short-lived [access token](/get-started/authenticate-your-client-applications) in the `access_token` query parameter rather than with an API key.
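
For clients that cannot set headers, the connection URL might be assembled as in the sketch below. `build_stt_url` is a hypothetical helper, not part of the Cartesia SDK; the endpoint and query parameter names are the ones used in this guide, and the token value is a placeholder.

```python
from urllib.parse import urlencode

CARTESIA_STT_URL = "wss://api.cartesia.ai/stt/websocket"


def build_stt_url(
    access_token: str,
    model: str = "ink-2",
    encoding: str = "pcm_s16le",
    sample_rate: int = 16000,
    cartesia_version: str = "2026-03-01",
) -> str:
    """Hypothetical helper: build a header-free WebSocket URL."""
    params = {
        "model": model,
        "encoding": encoding,
        "sample_rate": sample_rate,
        # Replaces the cartesia-version header
        "cartesia_version": cartesia_version,
        # Replaces the x-api-key header; use a short-lived access token
        "access_token": access_token,
    }
    return f"{CARTESIA_STT_URL}?{urlencode(params)}"


url = build_stt_url(access_token="my-short-lived-token")
print(url)
```

`urlencode` percent-encodes any characters in the token that are not URL-safe, so the token can be passed through as-is.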

Connect to the manual-finalization WebSocket with the Cartesia SDK:

<CodeGroup>
  ```python Python theme={null}
  import os
  from cartesia import AsyncCartesia

  client = AsyncCartesia(api_key=os.getenv("CARTESIA_API_KEY"))

  async with client.stt.manual_finalize.websocket(
      model="ink-2", encoding="pcm_s16le", sample_rate=16000
  ) as connection:
      ...
  ```

  ```typescript TypeScript theme={null}
  import Cartesia from "@cartesia/cartesia-js";

  const client = new Cartesia({ apiKey: process.env.CARTESIA_API_KEY });

  const connection = client.stt.manualFinalize.websocket({
    model: "ink-2",
    encoding: "pcm_s16le",
    sample_rate: 16000,
  });
  ```

  ```typescript TypeScript (Browser) theme={null}
  // Server-side: Generate access tokens using your API key
  import Cartesia from '@cartesia/cartesia-js';

  const client = new Cartesia({ apiKey: process.env.CARTESIA_API_KEY });

  export async function GET() {
    const { token } = await client.accessToken.create({
      grants: { stt: true, tts: false, agent: false },
      // How long the token lasts in seconds
      // Allowed values: 0–3600
      expires_in: 3600,
    });
    return Response.json({ token });
  }


  // Client-side
  // 1. Fetch an access token from your server
  // 2. Connect to Cartesia via WebSocket
  import Cartesia from "@cartesia/cartesia-js";

  async function getToken(): Promise<string> {
    const res = await fetch('/replace-with-your-server');
    const { token } = await res.json();
    return token;
  }
  const audioContext = new AudioContext();

  const client = new Cartesia({ token: await getToken() });

  const connection = client.stt.manualFinalize.websocket({
    model: "ink-2",
    encoding: "pcm_f32le",
    sample_rate: audioContext.sampleRate,
  });
  ```
</CodeGroup>

## Query parameters

| ElevenLabs Scribe (manual)                                                  | Cartesia Realtime STT (Manual)                                                           | Notes                                                                                                                             |
| --------------------------------------------------------------------------- | ---------------------------------------------------------------------------------------- | --------------------------------------------------------------------------------------------------------------------------------- |
| `model_id=scribe_v2_realtime` <Badge color="red" size="sm">required</Badge> | `model=ink-2` <Badge color="red" size="sm">required</Badge>                              | See [Models](/build-with-cartesia/stt/latest) for all options.                                                                    |
| `audio_format=pcm_16000`                                                    | `encoding=pcm_s16le` + `sample_rate=16000` <Badge color="red" size="sm">required</Badge> | ElevenLabs bundles format and rate; Cartesia splits them. See [encoding](#encoding).                                              |
| `commit_strategy=manual`                                                    | —                                                                                        | See [auto finalization](/use-the-api/stt/migrate-from-elevenlabs-realtime-speech-to-text/auto) for automatic commits.             |
| `language_code`                                                             | `language`                                                                               | `ink-2` only supports `en` right now. Use `ink-whisper` for [other languages](/build-with-cartesia/stt/older-models#ink-whisper). |
| —                                                                           | `cartesia_version=2026-03-01` <Badge color="red" size="sm">required</Badge>              | See [API Conventions](/use-the-api/api-conventions#always-send-a-cartesia-version-header) for details.                            |
| `include_timestamps`                                                        | —                                                                                        | Always included if supported by the model (`ink-whisper` only for now).                                                           |
| `keyterms`                                                                  | —                                                                                        | Coming soon!                                                                                                                      |
| `enable_logging`                                                            | —                                                                                        | Controlled by your organization.                                                                                                  |

<Accordion title="encoding">
  ElevenLabs bundles the sample format and rate into a single `audio_format` token. Cartesia splits them into `encoding` and `sample_rate`.

  | ElevenLabs `audio_format` | Cartesia `encoding` | Cartesia `sample_rate` |
  | ------------------------- | ------------------- | ---------------------- |
  | `pcm_8000`                | `pcm_s16le`         | `8000`                 |
  | `pcm_16000`               | `pcm_s16le`         | `16000`                |
  | `pcm_22050`               | `pcm_s16le`         | `22050`                |
  | `pcm_24000`               | `pcm_s16le`         | `24000`                |
  | `pcm_44100`               | `pcm_s16le`         | `44100`                |
  | `pcm_48000`               | `pcm_s16le`         | `48000`                |
  | `ulaw_8000`               | `pcm_mulaw`         | `8000`                 |

  Cartesia also accepts `pcm_s32le`, `pcm_f16le`, `pcm_f32le`, and `pcm_alaw`.\
  All Cartesia encodings support all sample rates.
</Accordion>
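
If your existing configuration stores an ElevenLabs `audio_format` value, a small lookup can translate it. This is a minimal sketch that encodes only the rows of the encoding table in this guide; `to_cartesia_audio_params` is a hypothetical name, not an SDK function.

```python
from typing import Dict, Tuple, Union

# ElevenLabs audio_format -> (Cartesia encoding, Cartesia sample_rate),
# copied from the encoding table in this guide.
AUDIO_FORMAT_MAP: Dict[str, Tuple[str, int]] = {
    "pcm_8000": ("pcm_s16le", 8000),
    "pcm_16000": ("pcm_s16le", 16000),
    "pcm_22050": ("pcm_s16le", 22050),
    "pcm_24000": ("pcm_s16le", 24000),
    "pcm_44100": ("pcm_s16le", 44100),
    "pcm_48000": ("pcm_s16le", 48000),
    "ulaw_8000": ("pcm_mulaw", 8000),
}


def to_cartesia_audio_params(audio_format: str) -> Dict[str, Union[str, int]]:
    """Hypothetical helper: split an ElevenLabs audio_format into Cartesia params."""
    try:
        encoding, sample_rate = AUDIO_FORMAT_MAP[audio_format]
    except KeyError:
        raise ValueError(f"Unknown ElevenLabs audio_format: {audio_format!r}") from None
    return {"encoding": encoding, "sample_rate": sample_rate}


print(to_cartesia_audio_params("pcm_16000"))
# → {'encoding': 'pcm_s16le', 'sample_rate': 16000}
```

The returned keys match the `encoding` and `sample_rate` query parameters, so the dictionary can be merged into the connection parameters directly.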

## Sending audio

ElevenLabs wraps each audio chunk in a JSON-formatted text frame and base64-encodes the audio bytes.\
Cartesia accepts audio chunks as binary frames, so send the raw audio bytes directly:

```diff theme={null}
- { "message_type": "input_audio_chunk", "audio_base_64": "<base64 PCM>", "commit": false, "sample_rate": 16000 }
+ <raw PCM bytes>
```

> * No need to supply previous text
> * Sample rate is determined upon connection by the `sample_rate` query parameter

Cartesia's control commands are bare text frames, not JSON.

To commit buffered audio and emit a transcript without ending the session, send a `finalize` frame:

```text theme={null}
finalize
```

| ElevenLabs                                                                        | Cartesia                                                                                                                                        |
| --------------------------------------------------------------------------------- | ----------------------------------------------------------------------------------------------------------------------------------------------- |
| Streams `partial_transcript` events, then `committed_transcript` once you commit. | **Streams final transcript deltas continuously** as audio arrives. `finalize` flushes text that might otherwise be held back for a few seconds. |

<Warning>
  Send the `finalize` command at the right points in the audio stream, such as when your user finishes speaking.

  Consider using [auto finalization](./auto) if you don't know when your user is done speaking.
</Warning>

To commit all remaining audio and close the session, send a `close` frame:

```text theme={null}
close
```

Cartesia will transcribe all buffered audio, then close the socket for you.

### Sending audio with the SDK

<CodeGroup>
  ```python Python theme={null}
  # ElevenLabs
  from base64 import b64encode
  await elevenlabs_connection.send({
    "audio_base_64": b64encode(raw_audio).decode("ascii"),
  })

  # Cartesia
  # raw_audio (bytes) - Raw audio data, about 100 ms at a time
  await connection.send_raw(raw_audio)
  ```

  ```typescript TypeScript theme={null}
  // ElevenLabs
  elevenLabsConnection.send({ audioBase64: rawAudio.toBase64() })

  // Cartesia
  // @param {ArrayBufferLike} rawAudio - raw audio data, about 100 ms at a time
  connection.sendRaw(rawAudio);
  ```
</CodeGroup>
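
When the audio arrives as one large buffer rather than a live stream, it can be split into roughly 100 ms pieces before sending. The sketch below assumes mono audio; `chunk_audio` and `BYTES_PER_SAMPLE` are hypothetical names, and the sample widths follow from the encoding names listed in this guide.

```python
from typing import Iterator

# Bytes per sample for each Cartesia encoding named in this guide.
BYTES_PER_SAMPLE = {
    "pcm_s16le": 2,
    "pcm_s32le": 4,
    "pcm_f16le": 2,
    "pcm_f32le": 4,
    "pcm_mulaw": 1,
    "pcm_alaw": 1,
}


def chunk_audio(
    raw_audio: bytes,
    encoding: str = "pcm_s16le",
    sample_rate: int = 16000,
    chunk_ms: int = 100,
) -> Iterator[bytes]:
    """Yield chunks of about chunk_ms milliseconds (assumes mono audio)."""
    samples_per_chunk = max(1, sample_rate * chunk_ms // 1000)
    chunk_bytes = samples_per_chunk * BYTES_PER_SAMPLE[encoding]
    for start in range(0, len(raw_audio), chunk_bytes):
        # The last chunk may be shorter than chunk_bytes
        yield raw_audio[start:start + chunk_bytes]


one_second_of_silence = bytes(16000 * 2)  # 16 kHz, 16-bit mono
chunks = list(chunk_audio(one_second_of_silence))
print(len(chunks), len(chunks[0]))
# → 10 3200
```

Each chunk can then be passed to `connection.send_raw(chunk)` as in the Python example above. Chunk boundaries always fall on whole samples because the chunk size is a multiple of the sample width.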

### Decoding base64 encoded audio before sending

<CodeGroup>
  ```python Python theme={null}
  # ElevenLabs
  await elevenlabs_connection.send({
    "audio_base_64": audio_base_64,
  })

  # Cartesia
  from base64 import b64decode
  await connection.send_raw(b64decode(audio_base_64))
  ```

  ```typescript TypeScript theme={null}
  // ElevenLabs
  elevenLabsConnection.send({ audioBase64 });

  // Cartesia
  connection.sendRaw(Uint8Array.fromBase64(audioBase64));
  ```
</CodeGroup>

### Committing and closing

<CodeGroup>
  ```python Python theme={null}
  # ElevenLabs
  await elevenlabs_connection.commit()

  # Cartesia
  await connection.send("finalize")

  # ElevenLabs: close the socket immediately
  await elevenlabs_connection.close()

  # Cartesia: commit remaining audio
  # and let the server close the socket once done
  await connection.send("close")

  # Cartesia: Close the socket early (optional)
  await connection.close()
  ```

  ```typescript TypeScript theme={null}
  // ElevenLabs
  elevenLabsConnection.commit();

  // Cartesia
  connection.send("finalize");

  // ElevenLabs: close the socket immediately
  elevenLabsConnection.close();

  // Cartesia: commit remaining audio
  // and let the server close the socket once done
  connection.send("close");

  // Cartesia: Close the socket early (optional)
  connection.close();
  ```
</CodeGroup>

## Event mapping

Scribe emits interim `partial_transcript` events, then a `committed_transcript` when you commit.\
Cartesia emits `transcript` deltas plus acknowledgments for the `finalize` and `close` commands.

| ElevenLabs `message_type`              | Cartesia `type`                                | Notes                                                                                                                                                                                     |
| -------------------------------------- | ---------------------------------------------- | ----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- |
| `partial_transcript`                   | `transcript` (`is_final: false`)               | Never sent by Ink 2 or Whisper (reserved for future models).                                                                                                                              |
| `committed_transcript`                 | `transcript` (`is_final: true`) + `flush_done` | ElevenLabs sends committed transcripts in one message; Cartesia sends `transcript` messages containing deltas, then a `flush_done` message once all deltas have been sent for the segment. |
| `committed_transcript_with_timestamps` | `transcript` (`is_final: true`) + `flush_done` | Only `ink-whisper` supports timestamps right now.                                                                                                                                         |
| —                                      | `done`                                         | Sent once all audio received before `close` has been transcribed, immediately before the WebSocket closes.                                                                                |
| `error`                                | `error`                                        | Client or server errors.                                                                                                                                                                  |
| `auth_error`                           | —                                              | Cartesia will reject the WebSocket upgrade with a 401 or 403 HTTP status.                                                                                                                 |
| `quota_exceeded`                       | `error`                                        | Cartesia's error response will contain `"error_code": "quota_exceeded"`.                                                                                                                  |
| `rate_limited`                         | `error`                                        | Cartesia's error response will contain `"error_code": "concurrency_limited"`.                                                                                                             |
| `session_time_limit_exceeded`          | —                                              | Cartesia will send a WebSocket close frame with code `1001`.                                                                                                                              |
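
If existing handlers are keyed on ElevenLabs `message_type` values, the error rows of this table could be expressed as a small translation shim. This is a sketch only: it assumes error events are available as plain dictionaries with `type` and `error_code` keys, and both function names are hypothetical.

```python
from typing import Optional

# Cartesia error_code -> closest ElevenLabs message_type, per the table above.
ERROR_CODE_TO_LEGACY = {
    "quota_exceeded": "quota_exceeded",
    "concurrency_limited": "rate_limited",
}


def legacy_type_for_error(event: dict) -> str:
    """Map a Cartesia error event to the ElevenLabs message_type a handler expects."""
    if event.get("type") != "error":
        raise ValueError("expected an event with type 'error'")
    # Unlisted or missing error codes fall back to the generic ElevenLabs 'error'
    return ERROR_CODE_TO_LEGACY.get(event.get("error_code"), "error")


def legacy_type_for_close(close_code: int) -> Optional[str]:
    """Map a WebSocket close code to an ElevenLabs message_type, if one applies."""
    return "session_time_limit_exceeded" if close_code == 1001 else None


print(legacy_type_for_error({"type": "error", "error_code": "concurrency_limited"}))
# → rate_limited
```

Authentication failures are not covered here because they surface as a rejected WebSocket upgrade (HTTP 401 or 403) rather than as an event.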

### Committed transcripts

A Scribe `committed_transcript` carries the **full text** of the segment since the last commit:

```json theme={null}
{
  "message_type": "committed_transcript",
  "text": "Hello world! This is the full transcript."
}
```

Becomes one or more Cartesia `transcript` events, each carrying a **delta**:

```json theme={null}
{
  "type": "transcript",
  "is_final": true,
  "text": "Hello world!",
  "duration": 0.5,
  "words": [
    {
      "word": "Hello",
      "start": 0,
      "end": 0.2
    },
    {
      "word": " world!",
      "start": 0.2,
      "end": 0.5
    }
  ],
  "request_id": "2ff8af53-4d38-479d-8287-58940f01c701"
}
```

> * Ink 2 does not return `duration` or `words` yet
> * Ink 2 and Whisper currently only emit final transcripts (`is_final: true`)

Followed by a Cartesia `flush_done` event:

```json theme={null}
{
  "type": "flush_done",
  "is_final": false,
  "request_id": "2ff8af53-4d38-479d-8287-58940f01c701"
}
```

> Ignore the `is_final` property on `flush_done` and `done` events

<Tip>Cartesia's final transcripts are **deltas**; concatenate them without stripping or adding whitespace.</Tip>

<CodeGroup>
  ```python Python theme={null}
  import asyncio
  from cartesia.types.stt import STTManualFinalizeWebsocketResponse

  partial_transcript = ""

  def on_message(message: STTManualFinalizeWebsocketResponse) -> None:
      global partial_transcript
      if message.type == "transcript" and message.is_final:
          # Do not strip or add whitespace!
          partial_transcript += message.text
          print(f"partial_transcript: {partial_transcript}")
      elif message.type == "flush_done" or message.type == "done":
          print(f"committed_transcript: {partial_transcript}")
          partial_transcript = ""
      elif message.type == "error":
          error_code = message.error_code or "unknown_error"
          if error_code == "quota_exceeded":
              print("You are out of credits")
          elif error_code == "concurrency_limited":
              print("You have too many open STT connections")
          else:
              print(f"{error_code}: {message.message}")

  connection.on("event", on_message)

  # ElevenLabs dispatches to your callbacks automatically;
  # with Cartesia you run the dispatch loop yourself
  recv_task = asyncio.create_task(connection.dispatch_events())
  ```

  ```typescript TypeScript theme={null}
  import Cartesia from '@cartesia/cartesia-js';

  let partialTranscript = '';

  connection.on("event", (message: Cartesia.STT.ManualFinalize.STTManualFinalizeWebsocketResponse) => {
    switch (message.type) {
      case "transcript":
        if (message.is_final) {
          // Do not trim or add whitespace!
          partialTranscript += message.text;
          console.log(`partial_transcript: ${partialTranscript}`);
        }
        break;
      case "flush_done":
      case "done":
        console.log(`committed_transcript: ${partialTranscript}`);
        partialTranscript = '';
        break;
    }
  });

  connection.on("error", (error) => {
    if (error.error) {
      // Server sent error (may be a bad request or internal server error)
      const errorCode = error.error.error_code || "unknown_error";
      switch (errorCode) {
        case "quota_exceeded":
          console.error("You are out of credits");
          break;
        case "concurrency_limited":
          console.error("You have too many open STT connections");
          break;
        default:
          console.error(`${errorCode}: ${error.error.message}`);
          break;
      }
    } else {
      // Client error
      console.error(`Client had an error: ${error.message}`);
    }
  });

  connection.on("close", (code: number, reason: string) => {
    if (code === 1001) {
      console.log("WebSocket closed due to inactivity");
    } else {
      console.log(`WebSocket closed (${code}): ${reason}`);
    }
  });
  ```
</CodeGroup>

## Example server messages

> Scribe sends full transcripts. Ink sends deltas and may break words.

| ElevenLabs Scribe (manual)                                                                                     | Cartesia Realtime STT (Manual)                                             |
| -------------------------------------------------------------------------------------------------------------- | -------------------------------------------------------------------------- |
| <Badge color="orange">partial\_transcript</Badge> `"Scribe sends"`                                             | <Badge color="green">is\_final: true</Badge> `"Scribe sends"`              |
| <Badge color="orange">partial\_transcript</Badge> `"Scribe sends full transcripts."`                           | <Badge color="green">is\_final: true</Badge> `" full transc"`              |
| <Badge>commit</Badge> *(client)*                                                                               | <Badge>finalize</Badge> *(client)*                                         |
| <Badge color="green">committed\_transcript</Badge> `"Scribe sends full transcripts."`                          | <Badge color="green">is\_final: true</Badge> `"ripts."`                    |
| <Badge color="green">committed\_transcript\_with\_timestamps</Badge> `"Scribe sends full transcripts."`        | <Badge>flush\_done</Badge>                                                 |
| <Badge color="orange">partial\_transcript</Badge> `"Ink sends deltas"`                                         | <Badge color="green">is\_final: true</Badge> `" Ink sends"`                |
| <Badge color="orange">partial\_transcript</Badge> `"Ink sends deltas and may break words."`                    | <Badge color="green">is\_final: true</Badge> `" deltas and may break wor"` |
| <Badge>commit</Badge> *(client)*                                                                               | <Badge>finalize</Badge> *(client)*                                         |
| <Badge color="green">committed\_transcript</Badge> `"Ink sends deltas and may break words."`                   | <Badge color="green">is\_final: true</Badge> `"ds."`                       |
| <Badge color="green">committed\_transcript\_with\_timestamps</Badge> `"Ink sends deltas and may break words."` | <Badge>flush\_done</Badge>                                                 |
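
The Cartesia column of this table can be replayed offline to check delta handling. The sketch below models events as plain dictionaries (an assumption made for illustration; the SDK delivers typed objects) and rebuilds one committed transcript per `flush_done`. `committed_transcripts` is a hypothetical helper.

```python
from typing import Iterable, List


def committed_transcripts(events: Iterable[dict]) -> List[str]:
    """Concatenate final transcript deltas into one string per finalized segment."""
    committed: List[str] = []
    pending = ""
    for event in events:
        if event["type"] == "transcript" and event.get("is_final"):
            # Deltas may split words: append as-is, no strip, no added space
            pending += event["text"]
        elif event["type"] in ("flush_done", "done"):
            # Design choice in this sketch: skip segments with no text
            if pending:
                committed.append(pending)
            pending = ""
    return committed


events = [
    {"type": "transcript", "is_final": True, "text": "Scribe sends"},
    {"type": "transcript", "is_final": True, "text": " full transc"},
    {"type": "transcript", "is_final": True, "text": "ripts."},
    {"type": "flush_done"},
    {"type": "transcript", "is_final": True, "text": " Ink sends"},
    {"type": "transcript", "is_final": True, "text": " deltas and may break wor"},
    {"type": "transcript", "is_final": True, "text": "ds."},
    {"type": "flush_done"},
]
print(committed_transcripts(events))
# → ['Scribe sends full transcripts.', ' Ink sends deltas and may break words.']
```

The second segment keeps its leading space because that space belongs to the delta stream. Strip it only when displaying a segment on its own, never before concatenating.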

## References

<CardGroup cols={2}>
  <Card icon="code" title="API Reference" href="/api-reference/stt/websocket">
    Cartesia Realtime STT (Manual)
  </Card>

  <Card icon="brackets-curly" title="Full Code Example" href="/examples/stt-manual-finalize-websocket">
    Using the Cartesia SDK
  </Card>
</CardGroup>
