This page covers moving an existing Cartesia self-hosted deployment to a newer release, and rolling back if the upgrade fails. For initial deployments, see Getting Started.

Release tags

Each Cartesia release is tagged with the format sonic-YYYYMMDD (e.g., sonic-20260503). A release tag pins the image version for every component in the chart:
  • cartesia-api
  • cartesia-license-proxy
  • Every worker image (cartesia-sonic-azure-disco, cartesia-sonic-rosy-dragon, etc.)
The Helm chart and Terraform code in cartesia-kube evolve independently from the release tag — you generally do not need to update them when upgrading the release. Set the tag once globally:
  • Helm: release.releaseTag in values.yaml
  • Terraform: release_tag in your .tfvars file
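Because a mistyped tag would be applied to every component at once, it can help to check the tag's shape before writing it to values.yaml or your .tfvars file. The sketch below is a hypothetical helper, not part of cartesia-kube; it only checks the sonic-YYYYMMDD pattern and does not confirm that the release exists.

```shell
# Hypothetical helper (not part of cartesia-kube): succeeds only if the
# argument looks like sonic-YYYYMMDD. It checks the pattern only; it does
# not validate the calendar date or that the release was published.
is_release_tag() {
  printf '%s\n' "$1" | grep -Eq '^sonic-[0-9]{8}$'
}

# Example: guard an upgrade script on the tag shape.
if is_release_tag "sonic-20260503"; then
  echo "tag format ok"
fi
```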

Upgrade Procedure

Bump the tag value and re-apply.
Edit your .tfvars file:
release_tag = "sonic-20260503"   # new tag
Then apply from the platform directory:
cd infra/aws/cartesia-eks   # or infra/gcp/cartesia-gke
terraform apply -var-file="../../../aws-terraform.tfvars" \
                -var "cartesia_api_key=$CARTESIA_API_KEY" \
                -var "service_account_json=$(cat /path/to/service-account.json)"
When you re-apply, Kubernetes rolls every deployment (API, license-proxy, all workers) in parallel. The order is not deterministic, so expect mixed pod images during the rollout window.
The chart applies release.releaseTag globally to every worker; there is no per-worker tag override. If you need per-worker staged validation before rolling all workers, contact support@cartesia.ai.
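To see the mixed state during the rollout window, one option is to count pods per image. The sketch below defines a hypothetical helper, count_images, that summarizes "pod<TAB>image" lines such as those produced by a kubectl jsonpath query; the helper name and the pipeline in the comment are illustrative assumptions.

```shell
# Hypothetical helper: reads "pod<TAB>image" lines on stdin and prints
# "image<TAB>count" for each distinct image field, sorted by image.
count_images() {
  awk -F'\t' '{ n[$2]++ } END { for (img in n) printf "%s\t%d\n", img, n[img] }' \
    | sort
}

# Possible usage against a live cluster (not run here):
#   kubectl get pods -n cartesia \
#     -o jsonpath='{range .items[*]}{.metadata.name}{"\t"}{.spec.containers[*].image}{"\n"}{end}' \
#     | count_images
```

More than one tag in the output for the same component means the rollout is still in progress.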

Verifying the upgrade

Run the Verify checklist from the Managed Kubernetes page. Additionally, watch each deployment’s rollout complete:
kubectl rollout status deployment/cartesia-api -n cartesia
kubectl rollout status deployment/<worker-name> -n cartesia
Confirm every pod is running the new image tag:
kubectl get pods -n cartesia -o jsonpath='{range .items[*]}{.metadata.name}{"\t"}{.spec.containers[*].image}{"\n"}{end}'
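Scanning that output by eye gets tedious with many workers. The sketch below filters it down to pods that are not on the expected tag. It assumes images are referenced as <image>:<release tag>, which this page does not state explicitly; pods_off_tag is a hypothetical helper name.

```shell
# Hypothetical helper: reads "pod<TAB>image(s)" lines on stdin and prints
# each pod whose image field does not contain ":<tag>". Exits non-zero if
# any such pod is found.
# Assumption: images are referenced as <image>:<release tag>.
# Limitation: a multi-container pod passes if any one of its images
# carries the tag.
pods_off_tag() {
  awk -F'\t' -v tag="$1" '
    BEGIN { bad = 0 }
    index($2, ":" tag) == 0 { print $1; bad = 1 }
    END { exit bad }
  '
}

# Possible usage against a live cluster (not run here):
#   kubectl get pods -n cartesia \
#     -o jsonpath='{range .items[*]}{.metadata.name}{"\t"}{.spec.containers[*].image}{"\n"}{end}' \
#     | pods_off_tag "sonic-20260503"
```

An empty result with exit status 0 means every listed pod reports the expected tag.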

Rolling Update Behavior

The chart configures these rolling update strategies:
| Component | maxSurge | maxUnavailable |
|---|---|---|
| API server | 1 | release.maxUnavailable (default 0) |
| Workers (each) | 3 | release.maxUnavailable (default 0) |
| Voice Clone worker | 3 | release.maxUnavailable (default 0) |
| License proxy | Kubernetes default (25%) | Kubernetes default (25%) |
With the default maxUnavailable: 0, Kubernetes brings new pods up before terminating old ones — capacity is never reduced below the desired count during an upgrade. To allow faster rollouts at the cost of temporary capacity loss, raise release.maxUnavailable in values.yaml.
With maxSurge: 3 and maxUnavailable: 0, a worker deployment temporarily needs (desired replicas + 3) GPUs available during the rollout. If your cluster is at exact GPU capacity, the new pods stay in the Pending state until the cluster autoscaler adds nodes, or until you scale the deployment down before upgrading.
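The headroom calculation can be sketched as a small helper. It assumes one GPU per worker pod, which is what the (desired replicas + 3) figure implies; peak_gpus is a hypothetical name, and the replica count in the example is made up.

```shell
# Peak GPU demand for one worker deployment during a rollout with
# maxUnavailable: 0. Assumes one GPU per worker pod.
#   $1 = desired replicas
#   $2 = maxSurge (defaults to 3, the chart's worker setting)
peak_gpus() {
  replicas=$1
  max_surge=${2:-3}
  echo $(( replicas + max_surge ))
}

# Example: a worker deployment with 8 replicas.
peak_gpus 8   # prints 11
```

Since all worker deployments roll in parallel, the cluster-wide peak is the sum of this figure over every worker deployment.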

Rollback Procedure

If the upgrade introduces a regression, revert the release tag.
List the revision history of the Helm release:
helm history cartesia -n cartesia
Roll back to a specific revision:
helm rollback cartesia <revision> -n cartesia
Or to the immediately previous revision:
helm rollback cartesia 0 -n cartesia
The rollback uses the same rolling update strategy as the upgrade — capacity is preserved during the swap.

Initial deployment failures

If the cluster never reaches a healthy state during an initial deployment, debug before retrying:
  • Inspect pod events: kubectl describe pod <pod> -n cartesia
  • Inspect logs: kubectl logs <pod> -n cartesia
If the deployment must be reversed before completion:
  • Helm: helm uninstall cartesia -n cartesia
  • Terraform: revert the change in version control and run terraform apply, or run terraform destroy for a full teardown.

Verifying the rollback

Confirm every pod is running the rolled-back tag and re-run the Verify checklist.

Hot-reload artifacts and rollback

Voice migrations and pronunciation dictionaries added via add-voices and add-pdict are stored in the customer GCS bucket (gs://cartesia-{name}/migrations/v2/migrations/) and applied via hot reload — see Managing Artifacts. These artifacts are independent of the release tag. Rolling back a release via helm rollback or terraform apply does not remove voice migrations or pronunciation dictionaries from the bucket — they remain available after rollback. There is no public API today to revoke a previously-added voice or pdict. If you need to remove an artifact from your deployment, contact support@cartesia.ai.