Cartesia’s inference cluster includes support for Prometheus, an open source metrics and monitoring solution. All metrics are scraped every 5 seconds via PodMonitor on port 8080 at /metrics.
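
To inspect the same endpoint by hand, the following is a minimal sketch that fetches the text a scrape would see and parses it into a dictionary. It assumes the pod is reachable over plain HTTP and that the endpoint serves the standard Prometheus text exposition format. The helper names `fetch_metrics_text` and `parse_samples` are hypothetical and not part of any Cartesia tooling.

```python
import urllib.request


def fetch_metrics_text(host: str, port: int = 8080, path: str = "/metrics",
                       timeout: float = 2.0) -> str:
    """Fetch the raw metrics text from one pod (assumes plain HTTP access)."""
    url = f"http://{host}:{port}{path}"
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        return resp.read().decode("utf-8")


def parse_samples(text: str) -> dict[str, float]:
    """Map each series (name plus optional {labels}) to its sample value."""
    samples: dict[str, float] = {}
    for raw in text.splitlines():
        line = raw.strip()
        # Skip blank lines and "# HELP" / "# TYPE" comment lines.
        if not line or line.startswith("#"):
            continue
        close = line.rfind("}")
        if close != -1:
            # Labeled series: everything up to the closing brace is the key.
            series, rest = line[: close + 1], line[close + 1:].split()
        else:
            parts = line.split()
            series, rest = parts[0], parts[1:]
        if rest:
            # rest[0] is the value; an optional timestamp may follow it.
            samples[series] = float(rest[0])
    return samples
```

Calling `parse_samples(fetch_metrics_text("10.0.0.12"))` with a pod IP of your own (the address here is a placeholder) returns a mapping such as `{"inferno_worker_load": 3.0, ...}`. This parser is deliberately small; for production use, a maintained Prometheus client library is a better choice.
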
Prometheus Metrics
| Metric Name | Description | Normal Range |
|---|---|---|
| inferno_worker_load | # of concurrent chunks the worker is processing now | < Capacity |
| inferno_worker_capacity | # of concurrent chunks a worker can process | hardware dependent |
| inferno_worker_ttfa | Time to First Audio | < 200 ms |
| inferno_worker_rtf | Real time factor | < 1 |
| api_queue_size | Request queue size per offering | Low |
| api_unserviceable_requests_size | Unserviceable requests count | 0 |
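
The normal ranges in the table can be turned into a simple check. The sketch below rests on several assumptions: each metric is exposed as a single unlabeled gauge, `inferno_worker_ttfa` is reported in milliseconds, and a numeric limit for "Low" queue size is supplied by the caller, since the table does not define one. The function name `find_out_of_range` is hypothetical.

```python
from typing import Optional


def find_out_of_range(samples: dict[str, float],
                      queue_limit: Optional[float] = None) -> list[str]:
    """Return one finding per metric that falls outside its normal range."""
    findings: list[str] = []

    load = samples.get("inferno_worker_load")
    capacity = samples.get("inferno_worker_capacity")
    if load is not None and capacity is not None and load >= capacity:
        findings.append(f"worker load {load} has reached capacity {capacity}")

    # Assumption: the TTFA sample is in milliseconds.
    ttfa = samples.get("inferno_worker_ttfa")
    if ttfa is not None and ttfa >= 200:
        findings.append(f"time to first audio {ttfa} ms is not below 200 ms")

    rtf = samples.get("inferno_worker_rtf")
    if rtf is not None and rtf >= 1:
        findings.append(f"real time factor {rtf} is not below 1")

    unserviceable = samples.get("api_unserviceable_requests_size")
    if unserviceable is not None and unserviceable > 0:
        findings.append(f"{unserviceable} unserviceable requests (expected 0)")

    # "Low" has no numeric definition in the table, so this check is opt-in.
    queue = samples.get("api_queue_size")
    if queue_limit is not None and queue is not None and queue > queue_limit:
        findings.append(f"queue size {queue} exceeds caller limit {queue_limit}")

    return findings
```

An empty list means every metric that was present is within range. Metrics missing from `samples` are skipped rather than reported, so the check can run against a partial scrape.
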