Cartesia’s self-hosted services support a configurable trade-off between latency and throughput for both TTS and STT deployments.

Core Components
API Server
The API Server is the entrypoint for all requests to your self-hosted Cartesia service. It handles incoming REST API requests and WebSocket connections.
PubSub Controller (NATS)
We use an asynchronous communication protocol between the API server and the model containers to keep request handling smooth and low-latency. This design allows:
- Model containers to leave and join the cluster freely.
- Efficient stateful management of long-running request lifecycles.
- Coordination between the API server and model containers to find the lowest-latency pathway for a request.
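To make the decoupling concrete, here is a minimal in-process sketch of the pattern. It is not the Cartesia implementation and does not use the NATS client API: `InMemoryBroker`, `model_worker`, and the request names are hypothetical stand-ins built on Python's `asyncio`, chosen so the example runs without a message server.

```python
import asyncio


class InMemoryBroker:
    """Toy stand-in for a pub/sub controller (hypothetical; not the NATS API)."""

    def __init__(self) -> None:
        self._queue: asyncio.Queue = asyncio.Queue()

    async def publish(self, request_id: str, payload: str) -> asyncio.Future:
        # The API server publishes a request and gets a future for the reply.
        reply = asyncio.get_running_loop().create_future()
        await self._queue.put((request_id, payload, reply))
        return reply

    async def next_request(self):
        return await self._queue.get()


async def model_worker(name: str, broker: InMemoryBroker) -> None:
    # Workers pull from a shared queue, so they can start or stop at any time.
    while True:
        request_id, payload, reply = await broker.next_request()
        await asyncio.sleep(0.01)  # pretend to run inference
        reply.set_result(f"{name} handled {request_id}: {payload.upper()}")


async def main() -> list[str]:
    broker = InMemoryBroker()
    workers = [asyncio.create_task(model_worker("worker-a", broker))]
    first = await (await broker.publish("req-1", "hello"))

    # A second worker joins and the first leaves; the publishing side is unchanged.
    workers.append(asyncio.create_task(model_worker("worker-b", broker)))
    workers[0].cancel()
    await asyncio.gather(workers[0], return_exceptions=True)

    second = await (await broker.publish("req-2", "world"))
    workers[1].cancel()
    await asyncio.gather(workers[1], return_exceptions=True)
    return [first, second]


if __name__ == "__main__":
    for line in asyncio.run(main()):
        print(line)
```

The publisher never addresses a specific worker, which is what lets workers come and go without any change on the API server side.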
Model Workers (Engine)
Cartesia provides batched engine workers for both TTS and STT. The core parameter to customize here is the batch_size (B). We discuss the trade-offs for this and other parameters in the Performance Tuning sections.
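As a rough intuition for why B matters, the sketch below uses a toy linear cost model in which a batch takes a fixed overhead plus a per-request cost. The model and the numbers (`fixed_ms`, `per_item_ms`) are assumptions for illustration only, not measurements of Cartesia's engine.

```python
def batch_metrics(
    batch_size: int,
    fixed_ms: float = 20.0,     # hypothetical fixed cost per batch
    per_item_ms: float = 5.0,   # hypothetical extra cost per request in the batch
) -> tuple[float, float]:
    """Return (latency_ms, throughput_rps) under a toy linear cost model."""
    latency_ms = fixed_ms + per_item_ms * batch_size
    throughput_rps = batch_size / (latency_ms / 1000.0)
    return latency_ms, throughput_rps


for b in (1, 4, 16):
    latency, throughput = batch_metrics(b)
    print(f"B={b:>2}  latency={latency:.0f} ms  throughput={throughput:.0f} req/s")
```

Under these assumptions, a larger B raises both the requests served per second and the time each request waits for its batch to finish, which is the latency versus throughput trade-off in miniature.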
License Proxy Server
We deploy a single service that talks to our cloud environment to authenticate the self-hosted deployment and verify that its license is valid. The primary reason is that this becomes the only service making outbound calls, which makes network security policies easier to configure. The proxy also lets you choose the level of isolation you want:
- Connected: The deployment validates its license by pinging our cloud periodically and sends usage telemetry.
- Air-gapped: A completely isolated offering in which you work with an offline license. In Air-gapped mode, we work with you directly to get usage information via audit logs.
If you need a completely isolated deployment rather than Connected mode, our GTM team can work with you to set things up.
In both Connected and Air-gapped modes, grace periods are configured, so operations are not terminated immediately when the deployment is disconnected or the license expires.
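The idea of a grace period can be sketched as a simple time check. Everything here is hypothetical: the function name `within_grace`, the 72-hour default, and the exact rules are assumptions for illustration and do not describe how the License Proxy Server actually decides.

```python
from datetime import datetime, timedelta
from typing import Optional


def within_grace(
    now: datetime,
    license_expiry: datetime,
    last_successful_check: Optional[datetime] = None,
    grace: timedelta = timedelta(hours=72),  # hypothetical value
) -> bool:
    """Return True while the deployment should keep serving requests."""
    # License expiry: keep running until expiry plus the grace period.
    if now > license_expiry + grace:
        return False
    # Connected mode only: tolerate losing contact with the cloud for up to `grace`.
    if last_successful_check is not None and now > last_successful_check + grace:
        return False
    return True


expiry = datetime(2030, 1, 1)
print(within_grace(datetime(2030, 1, 2), expiry))   # one day past expiry, inside grace
print(within_grace(datetime(2030, 1, 10), expiry))  # nine days past expiry, outside grace
```

Passing `last_successful_check=None` models an Air-gapped deployment in this sketch, where there is no periodic cloud check to lose.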