Anam + Cartesia

Last verified: 2026-06-11

Overview

Use this integration to run an Anam interactive avatar with a Cartesia Sonic TTS voice. Anam handles the avatar session, WebRTC stream, and rendering, while a Cartesia voice provides the speech output.

Prerequisites

Anam API key (ANAM_API_KEY) from the Anam Lab API keys page
Node.js 18+ for the backend token endpoint
A frontend app that can load the Anam JavaScript SDK (see Anam's production guide)

This flow does not require a Cartesia API key in your app. Anam uses the saved Cartesia voice attached to the Anam voiceId.

Installation

npm install express @anam-ai/js-sdk

The snippets use ESM import syntax. Set your Node project to ESM in package.json so the backend runs as written:

{
  "type": "module"
}

Quick start

Create an Anam session token with a Cartesia voice, then initialize the avatar in the browser with the returned session token.

1) Backend: create an Anam session token

import express from "express";

const app = express();
app.use(express.json());

const requiredEnvVars = ["ANAM_API_KEY"];

for (const name of requiredEnvVars) {
  if (!process.env[name]) {
    throw new Error(`${name} is required`);
  }
}

// Swap these for another avatar, Cartesia voice, or LLM from Anam Lab.
const avatarId = "071b0286-4cce-4808-bee2-e642f1062de3"; // Liv
const voiceId = "c48c4dd9-5050-11f1-9076-5e955d484d11"; // Siobhan - Warm Welcomer
const llmId = "a7cf662c-2ace-4de1-a21e-ef0fbf144bb7"; // GPT OSS 120B

app.post("/api/anam-session", async (req, res) => {
  const sessionTokenResponse = await fetch("https://api.anam.ai/v1/auth/session-token", {
    method: "POST",
    headers: {
      Authorization: `Bearer ${process.env.ANAM_API_KEY}`,
      "Content-Type": "application/json",
    },
    body: JSON.stringify({
      personaConfig: {
        name: "Cartesia voice avatar",
        avatarId,
        voiceId,
        llmId,
        systemPrompt: "You are a concise voice assistant. Keep responses brief.",
        voiceGenerationOptions: {
          speed: 1,
          volume: 1,
          emotion: "content",
        },
      },
    }),
  });

  if (!sessionTokenResponse.ok) {
    const errorBody = await sessionTokenResponse.text();
    return res.status(502).json({
      error: "Failed to create Anam session token",
      detail: errorBody,
    });
  }

  const session = await sessionTokenResponse.json();
  return res.json(session); // { sessionToken }
});

app.listen(3000);

The voiceId value is an Anam voice ID for a Cartesia voice, not a raw Cartesia provider voice ID. To find another one with the API, call the list voices endpoint and choose a voice where provider is CARTESIA. As of writing, Anam Lab lists 304 stock Cartesia voices.

curl "https://api.anam.ai/v1/voices?perPage=100&search=voice-name" \
  -H "Authorization: Bearer $ANAM_API_KEY"
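To narrow the list to Cartesia voices in code, a filter along these lines may help. This is a minimal sketch: the only detail taken from this guide is that a voice has a provider value of CARTESIA. The flat array shape and the id and displayName fields are assumptions for illustration, so check the actual list voices response before relying on them.

```javascript
// Hypothetical helper: keep only voices whose provider is CARTESIA.
// Assumes `voices` is an array of objects that each carry a `provider` string.
function filterCartesiaVoices(voices) {
  return voices.filter(
    (voice) => String(voice.provider).toUpperCase() === "CARTESIA"
  );
}

// Sample data with an assumed shape, for illustration only.
const sampleVoices = [
  { id: "voice-a", displayName: "Example A", provider: "CARTESIA" },
  { id: "voice-b", displayName: "Example B", provider: "OTHER" },
];

const cartesiaVoices = filterCartesiaVoices(sampleVoices);
console.log(cartesiaVoices.map((voice) => voice.id));
```

If the real response wraps the voices in another object or paginates them, pass only the array of voice objects to the helper.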

You can also clone a Cartesia voice or import a Cartesia provider voice in Lab. When importing, set the provider voice ID and a default provider model such as sonic-3.5. Lab validates the voice before saving it as an Anam voice.

2) Frontend: initialize Anam with session token

Add a video element and a start button in your page HTML:

<video id="anam-video" autoplay playsinline></video>
<button id="start-avatar" type="button">Start avatar</button>

The sample below is a TypeScript browser module entry file. It needs a bundler or build step that resolves the @anam-ai/js-sdk import. For plain JavaScript, remove the type annotations.

import { createClient } from "@anam-ai/js-sdk";

let anamClient: ReturnType<typeof createClient> | undefined;

async function startAvatar() {
  const response = await fetch("/api/anam-session", { method: "POST" });

  if (!response.ok) {
    throw new Error("Failed to create Anam session");
  }

  const { sessionToken } = await response.json();
  anamClient = createClient(sessionToken);

  await anamClient.streamToVideoElement("anam-video");
}

document.querySelector<HTMLButtonElement>("#start-avatar")?.addEventListener("click", () => {
  startAvatar().catch((error) => {
    console.error(error);
  });
});

When the session starts, the video element shows Liv speaking with the Siobhan Cartesia voice. Try asking a short question in the browser.
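Before passing the token to createClient, it can be worth checking that the backend response actually contains one. The helper below is a sketch only: extractSessionToken is a hypothetical name, and the only assumption about the response is the { sessionToken } shape returned by the backend route above.

```javascript
// Hypothetical helper: pull the session token out of the parsed response body
// and fail early with a clear message if it is missing or empty.
function extractSessionToken(body) {
  const token = body && body.sessionToken;
  if (typeof token !== "string" || token.length === 0) {
    throw new Error("Response did not include a sessionToken");
  }
  return token;
}

// Example with a placeholder token value.
const token = extractSessionToken({ sessionToken: "example-token" });
console.log(token.length > 0);
```

In startAvatar, this would replace the destructuring of response.json(), so a malformed response surfaces as one explicit error instead of a failure inside the SDK.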

Voice and performance options

Use voiceGenerationOptions on the session token request to set Cartesia voice generation controls for the session, such as speed, volume, and emotion. See Anam’s voice configuration docs for supported values.
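One way to vary these options per session is to merge overrides onto a set of defaults before building the request body. The sketch below assumes nothing about Anam beyond the personaConfig fields used in the quick start. withVoiceOptions is a hypothetical helper, and the override value is a placeholder, not a documented supported value.

```javascript
// Defaults copied from the quick start request body.
const defaultVoiceGenerationOptions = { speed: 1, volume: 1, emotion: "content" };

// Hypothetical helper: return a new personaConfig with voice options merged.
// Precedence, lowest to highest: defaults, options already on the config, overrides.
function withVoiceOptions(personaConfig, overrides = {}) {
  return {
    ...personaConfig,
    voiceGenerationOptions: {
      ...defaultVoiceGenerationOptions,
      ...personaConfig.voiceGenerationOptions,
      ...overrides,
    },
  };
}

// Placeholder override; confirm supported values in Anam's voice configuration docs.
const personaConfig = withVoiceOptions(
  { name: "Cartesia voice avatar" },
  { speed: 1.1 }
);
console.log(JSON.stringify(personaConfig.voiceGenerationOptions));
```

The helper returns a new object instead of mutating its input, so one base config can be reused across sessions with different overrides.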

For finer control over the avatar's delivery, Anam can pass Cartesia-supported inline cues through in the transcript and, where Director Notes are enabled, use the same cue names to direct the avatar. For example, a laughter cue from Cartesia's documented speech control tags can also shift the avatar toward a playful visual style.
