# Docker

> Deploy Cartesia on bare-metal or VM nodes using Docker Compose or Docker Swarm

<Note>Docker Compose and Docker Swarm deployments are currently in **beta**. Contact the Cartesia team for support.</Note>

Deploy Cartesia TTS on a **single machine** with Docker Compose, or across a **multi-node cluster** with Docker Swarm.

|                 | Docker Compose                                       | Docker Swarm                              |
| --------------- | ---------------------------------------------------- | ----------------------------------------- |
| **Nodes**       | Single host                                          | Multiple hosts (managers + workers)       |
| **GPU scaling** | Multiple workers via `WORKER_REPLICAS` (one per GPU) | Workers scheduled on labeled GPU nodes    |
| **MIG support** | Auto-detected via `--mig` flag                       | Per-node via node labels and `--mig` flag |
| **Networking**  | Bridge (default)                                     | Overlay (Swarm-managed)                   |

## Prerequisites

* One or more machines with Docker installed (your user must be in the `docker` group)
* **Compose only:** Docker Compose V2 (`docker compose`)
* **Swarm only:** nodes meet Docker's [Swarm networking requirements](https://docs.docker.com/engine/swarm/networking/)
* At least one NVIDIA GPU with drivers installed. MIG (Multi-Instance GPU) partitioning is supported on compatible NVIDIA GPUs.
* GPU nodes have the **nvidia Docker runtime set as default** (see below)
* The `cartesia-kube` repo downloaded as described in [Downloading cartesia-kube](/self-hosted/getting-started#downloading-kube)
* A Cartesia API key file (`container_key`) and a GCS service account JSON file, provided during onboarding

### GPU runtime check

On each GPU node, verify the NVIDIA runtime:

```bash theme={null}
nvidia-smi

docker info | grep "Default Runtime"
# Expected: Default Runtime: nvidia

docker run --rm nvidia/cuda:12.3.1-base-ubuntu22.04 nvidia-smi
```

If `nvidia` is not the default runtime, install the [NVIDIA Container Toolkit](https://docs.nvidia.com/datacenter/cloud-native/container-toolkit/latest/install-guide.html) and run:

```bash theme={null}
sudo nvidia-ctk runtime configure --runtime=docker --set-as-default
sudo systemctl restart docker
```
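
When several GPU nodes need the same check, it can be wrapped in a small helper. The sketch below is illustrative and not part of `cartesia-kube`. The function names `default_runtime` and `is_nvidia_default` are hypothetical, and the parsing assumes `docker info` prints a `Default Runtime:` line in the format shown above.

```bash
# Hypothetical helper (not part of cartesia-kube): extract the default
# runtime name from `docker info` output supplied on stdin.
default_runtime() {
  sed -n 's/^[[:space:]]*Default Runtime:[[:space:]]*//p'
}

# Returns 0 when the given runtime name is "nvidia", 1 otherwise.
is_nvidia_default() {
  [ "$1" = "nvidia" ]
}

# Example usage on a GPU node:
#   runtime="$(docker info 2>/dev/null | default_runtime)"
#   is_nvidia_default "$runtime" || echo "Default runtime is '$runtime', expected nvidia" >&2
```

Reading from stdin keeps the parsing separate from the Docker call, so the same function can be run against saved `docker info` output collected from other hosts.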

**If using MIG:** After enabling MIG and creating instances on the host, verify they are visible:

```bash theme={null}
nvidia-smi -L
# Each MIG instance appears as a MIG-... UUID line beneath its parent GPU.
# With Compose, the deploy script reads these UUIDs automatically. With Swarm,
# you add them to node labels in Step 2.
```

<Note>MIG must be enabled and instances created on the host before deploying. Recreating MIG instances generates new UUIDs. If this happens, redeploy the stack (on Swarm, update the `mig.uuids` node labels first).</Note>

***

## Step 1 — Prepare secrets

Place these files on the host (Compose) or **manager node** (Swarm):

* `container_key` — file containing your Cartesia API key
* `service-account.json` — GCS service account JSON with `roles/artifactregistry.reader` (image pull) and `roles/storage.objectViewer` (GCS sync)

Make the deploy script executable:

<Tabs>
  <Tab title="Compose">
    ```bash theme={null}
    chmod +x local/scripts/deploy-compose.sh
    ```
  </Tab>

  <Tab title="Swarm">
    ```bash theme={null}
    chmod +x local/scripts/deploy-swarm.sh
    ```
  </Tab>
</Tabs>

***

## Step 2 — Initialize the cluster (Swarm only)

Skip this step if you are using Docker Compose.

On the **manager node**:

```bash theme={null}
docker swarm init --advertise-addr <MANAGER_IP>
```

Copy the `docker swarm join` command from the output. On **each additional node**, run:

```bash theme={null}
docker swarm join --token <TOKEN> <MANAGER_IP>:2377
```

Label each node from the manager. Use `docker node ls` to list node IDs:

```bash theme={null}
docker node update --label-add cpu=true <node-id>   # CPU services (API, NATS, etc.)
docker node update --label-add gpu=true <node-id>   # Standard GPU workers
```
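
Before deploying, it can help to confirm that no node was left without a role label. The sketch below is one possible check, not something shipped with `cartesia-kube`. The function name `unlabeled_nodes` is hypothetical, and it assumes `docker node inspect` renders labels in Go's default map format, such as `map[cpu:true]`.

```bash
# Hypothetical helper: read lines of the form "<hostname> map[key:value ...]"
# on stdin and print the hostnames that carry none of the cpu=true, gpu=true,
# or mig=true labels.
unlabeled_nodes() {
  grep -Ev '[[ ](cpu|gpu|mig):true' | awk 'NF { print $1 }'
}

# Example usage on the manager node:
#   docker node ls -q \
#     | xargs docker node inspect --format '{{ .Description.Hostname }} {{ .Spec.Labels }}' \
#     | unlabeled_nodes
```

An empty result means every node has at least one of the three labels.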

**If using MIG:** Label MIG-enabled nodes with `mig=true` and a comma-separated list of their MIG instance UUIDs (obtained from `nvidia-smi -L` on that node). Do **not** apply `gpu=true` to MIG nodes.

```bash theme={null}
docker node update --label-add mig=true <node-id>
docker node update --label-add 'mig.uuids=MIG-<uuid1>,MIG-<uuid2>' <node-id>
```
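
Typing the UUID list by hand is error-prone, so the value for `mig.uuids` can be generated on the MIG node instead. This is a minimal sketch under two assumptions: the helper name `mig_uuids_csv` is hypothetical, and `nvidia-smi -L` reports MIG instances with UUIDs of the form `MIG-<hex-and-dashes>`, as recent drivers do.

```bash
# Hypothetical helper: read `nvidia-smi -L` output on stdin and print the MIG
# instance UUIDs as a single comma-separated line.
mig_uuids_csv() {
  grep -oE 'MIG-[0-9a-fA-F-]+' | paste -sd, -
}

# Example usage:
#   On the MIG node:   nvidia-smi -L | mig_uuids_csv
#   On the manager:    docker node update --label-add 'mig.uuids=<printed value>' <node-id>
```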

Mixed clusters with both standard GPU nodes and MIG nodes are supported — the deploy script handles scheduling for both automatically.

***

## Step 3 — Configure environment

Set [environment variables](#configuration) before deploying. Use a `.env` file in `local/` (see `local/.env.example`) or export them in your shell.

```bash theme={null}
export IMAGE_REGISTRY="YOUR_IMAGE_REGISTRY"
export RELEASE_TAG="YOUR_RELEASE_TAG"
export MODEL_NAME="YOUR_MODEL_NAME"

export CONTAINER_KEY_FILE=/path/to/container_key
export GCS_SA_FILE=/path/to/service-account.json

# Optional
export WORKER_REPLICAS=1
export WORKER_CAPACITY=4
export BUCKET_NAME=""
export CLUSTER_NAME="cartesia-compose"   # or "cartesia-swarm"
export USE_MIG=0                         # set to 1 to enable MIG mode (or pass --mig to the deploy script)
```

See [Configuration](#configuration) for full details on each variable.
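
A missing variable or a wrong file path is easier to catch before the deploy script runs. The sketch below checks the five required variables in the current shell and confirms that the two file paths are readable. It is an illustration, not part of `cartesia-kube`: the function name `check_deploy_env` is hypothetical, and it only sees exported or shell-local values, not entries that exist solely in `local/.env`.

```bash
# Hypothetical pre-deploy check (not part of cartesia-kube).
# Prints one line per problem to stderr and returns non-zero if any were found.
check_deploy_env() {
  local status=0 name value
  # Every required variable must be set to a non-empty value.
  for name in IMAGE_REGISTRY RELEASE_TAG MODEL_NAME CONTAINER_KEY_FILE GCS_SA_FILE; do
    eval "value=\${$name:-}"
    if [ -z "$value" ]; then
      echo "missing: $name is not set" >&2
      status=1
    fi
  done
  # The two *_FILE variables must point at readable files.
  for name in CONTAINER_KEY_FILE GCS_SA_FILE; do
    eval "value=\${$name:-}"
    if [ -n "$value" ] && [ ! -r "$value" ]; then
      echo "unreadable: $name points to $value" >&2
      status=1
    fi
  done
  return "$status"
}

# Example usage:
#   check_deploy_env && ./local/scripts/deploy-compose.sh
```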

***

## Step 4 — Deploy

<Tabs>
  <Tab title="Compose">
    From the repo root:

    ```bash theme={null}
    # Standard deployment
    ./local/scripts/deploy-compose.sh

    # With MIG support (auto-detects MIG instances via nvidia-smi)
    ./local/scripts/deploy-compose.sh --mig
    ```

    When `--mig` is used, the script auto-detects MIG instance UUIDs from `nvidia-smi`, generates a per-slice worker configuration, and scales the standard worker to zero.
  </Tab>

  <Tab title="Swarm">
    On the **manager node**:

    ```bash theme={null}
    # Standard deployment
    ./local/scripts/deploy-swarm.sh

    # With MIG support (reads UUIDs from node labels)
    ./local/scripts/deploy-swarm.sh --mig
    ```

    This will:

    1. Verify that nodes are labeled (fails with instructions if not).
    2. Create encrypted Swarm secrets from your key and service account files.
    3. Deploy all services. With `--mig`, one dedicated worker service is created per MIG instance, each pinned to its node.
  </Tab>
</Tabs>

<Warning>
  TTS workers take a few minutes to load the model into GPU memory. During this time, TTS requests will return errors even though containers appear healthy. Wait for the ready signal:

  <Tabs>
    <Tab title="Compose">
      ```bash theme={null}
      cd local && docker compose -f docker-compose.base.yaml -f docker-compose.yaml logs -f tts-worker 2>&1 | grep -i "ready"
      ```
    </Tab>

    <Tab title="Swarm">
      ```bash theme={null}
      docker service logs cartesia_tts-worker -f 2>&1 | grep -i "ready"
      ```
    </Tab>
  </Tabs>
</Warning>
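
The log commands above follow the output until interrupted, which suits a terminal but not a script. For automation, a small gate that returns at the first matching line may be more convenient. The helper name `wait_for_ready` is hypothetical, and it relies on the same assumption as the commands above: that the worker logs a line containing "ready" once the model is loaded.

```bash
# Hypothetical helper: read log lines on stdin and return 0 as soon as one
# contains "ready" (case-insensitive). Returns 1 if the stream ends first.
wait_for_ready() {
  grep -qi 'ready'
}

# Example usage (Swarm):
#   docker service logs cartesia_tts-worker -f 2>&1 | wait_for_ready && echo "TTS worker is ready"
```

Because the log command keeps running until it next writes to the closed pipe, the pipeline may return slightly after the match. Wrapping the whole pipeline in `timeout` is one way to bound the wait.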

***

## Step 5 — Verify

Check that services are running:

<Tabs>
  <Tab title="Compose">
    ```bash theme={null}
    cd local && docker compose -f docker-compose.base.yaml -f docker-compose.yaml ps
    ```

    If deployed with MIG, include the generated MIG Compose file so the per-instance worker services are listed. Expect one running worker per MIG instance:

    ```bash theme={null}
    # List all running services (MIG workers appear as tts-worker-mig-0, tts-worker-mig-1, etc.)
    cd local && docker compose -f docker-compose.base.yaml -f docker-compose.yaml -f docker-compose.mig.generated.yaml ps
    ```
  </Tab>

  <Tab title="Swarm">
    ```bash theme={null}
    docker stack services cartesia
    ```

    If deployed with MIG, verify MIG worker services are scheduled and running:

    ```bash theme={null}
    docker stack ps cartesia --filter 'name=cartesia_tts-worker-mig'
    ```
  </Tab>
</Tabs>

Test the API:

```bash theme={null}
curl http://localhost:5000/status
```

Test TTS:

```bash theme={null}
curl -s -X POST "http://localhost:5000/tts/bytes" \
  -H "Content-Type: application/json" \
  -H "X-API-Key: YOUR_API_KEY" \
  -H "Cartesia-Version: 2024-06-10" \
  -d '{
    "model_id": "sonic-3.5",
    "transcript": "Hello from Cartesia.",
    "voice": {"mode": "id", "id": "00510a15-4216-4fdc-a0ab-05d74cd9f795"},
    "language": "en",
    "output_format": {"container": "mp3", "sample_rate": 44100, "bit_rate": 128000}
  }' --output test.mp3
```
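
With `--output`, curl writes whatever the server returns into `test.mp3`, including an error body, so the file existing does not prove the request worked. One way to check is to capture the HTTP status code with curl's `-w '%{http_code}'` option and test it alongside the file. The helper name `tts_response_ok` is hypothetical, and treating any 2xx status with a non-empty body as success is an assumption about the API.

```bash
# Hypothetical helper: decide whether a TTS request succeeded, given the HTTP
# status code and the path of the file curl wrote.
tts_response_ok() {
  local code="$1" file="$2"
  case "$code" in
    2??) [ -s "$file" ] ;;   # 2xx status and a non-empty file
    *)   return 1 ;;
  esac
}

# Example usage (same request as above, with the status code captured):
#   code="$(curl -s -o test.mp3 -w '%{http_code}' -X POST "http://localhost:5000/tts/bytes" ...)"
#   tts_response_ok "$code" test.mp3 || { echo "TTS failed (HTTP $code)" >&2; cat test.mp3 >&2; }
```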

***

## Troubleshooting

<Tabs>
  <Tab title="Compose">
    ```bash theme={null}
    cd local

    docker compose -f docker-compose.base.yaml -f docker-compose.yaml logs api
    docker compose -f docker-compose.base.yaml -f docker-compose.yaml logs tts-worker

    # Restart everything
    docker compose -f docker-compose.base.yaml -f docker-compose.yaml down
    docker compose -f docker-compose.base.yaml -f docker-compose.yaml up -d
    ```

    If the API exits with `no servers available for connection` (NATS not ready), restart the API after the stack is up:

    ```bash theme={null}
    cd local && docker compose -f docker-compose.base.yaml -f docker-compose.yaml up -d && docker compose -f docker-compose.base.yaml -f docker-compose.yaml restart api
    ```
  </Tab>

  <Tab title="Swarm">
    ```bash theme={null}
    docker stack ps cartesia --no-trunc

    docker service logs cartesia_api
    docker service logs cartesia_tts-worker

    # Restart the stack
    docker stack rm cartesia
    sleep 10
    cd local && docker stack deploy --with-registry-auth -c docker-compose.base.yaml -c docker-compose.swarm.yaml cartesia
    ```
  </Tab>
</Tabs>

***

## Configuration

Set these environment variables before running the deploy script. You receive `IMAGE_REGISTRY`, `RELEASE_TAG`, and `MODEL_NAME` from Cartesia during onboarding. If you mirror images into your own registry, use your mirror URL for `IMAGE_REGISTRY`.

### Required

| Variable             | Description                                                        |
| -------------------- | ------------------------------------------------------------------ |
| `IMAGE_REGISTRY`     | Container image registry URL (Cartesia registry or your mirror).   |
| `RELEASE_TAG`        | Image tag for the release you are deploying (updates per release). |
| `MODEL_NAME`         | TTS model identifier for the worker image.                         |
| `CONTAINER_KEY_FILE` | Path to file containing your Cartesia API key.                     |
| `GCS_SA_FILE`        | Path to GCS service account JSON file.                             |

### Optional

| Variable            | Default                               | Description                                                                                                                     |
| ------------------- | ------------------------------------- | ------------------------------------------------------------------------------------------------------------------------------- |
| `WORKER_REPLICAS`   | `1`                                   | Number of TTS worker containers. For Compose, set to your GPU count on the host. For Swarm, scale to match your GPU node count. |
| `WORKER_CAPACITY`   | `4`                                   | Max concurrent TTS requests per worker. Lower if you run out of GPU memory.                                                     |
| `BUCKET_NAME`       | *(empty)*                             | GCS bucket for migrations/LoRAs. Leave empty to disable sync.                                                                   |
| `CLUSTER_NAME`      | `cartesia-compose` / `cartesia-swarm` | Identifier for logs and metrics.                                                                                                |
| `GCS_SYNC_INTERVAL` | `300`                                 | GCS sync interval in seconds.                                                                                                   |
| `USE_MIG`           | `0`                                   | Set to `1` to enable MIG mode.                                                                                                  |
