Cartesia’s inference cluster supports Prometheus, an open-source metrics and monitoring system. Pods expose their metrics on port 8080 at the /metrics path, and a PodMonitor scrapes every metric at a 5-second interval.
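To see what Prometheus sees, it can help to fetch the endpoint by hand. The sketch below only assumes the port and path stated above; the pod address is a placeholder you supply (for example, `127.0.0.1` after a `kubectl port-forward` to the pod), and plain HTTP without authentication is an assumption rather than something this page confirms.

```python
import urllib.request

METRICS_PORT = 8080        # port stated in the text above
METRICS_PATH = "/metrics"  # path stated in the text above


def metrics_url(pod_address: str) -> str:
    """Build the scrape URL for one pod.

    `pod_address` is a hypothetical placeholder: a pod IP, or 127.0.0.1
    when the pod's port 8080 is forwarded to your machine.
    """
    # Assumption: the endpoint is served over plain HTTP.
    return f"http://{pod_address}:{METRICS_PORT}{METRICS_PATH}"


def fetch_metrics(pod_address: str, timeout: float = 5.0) -> str:
    """Return the raw Prometheus text exposition served by the pod."""
    with urllib.request.urlopen(metrics_url(pod_address), timeout=timeout) as resp:
        return resp.read().decode("utf-8")
```

Calling `fetch_metrics("127.0.0.1")` against a forwarded pod should return the same text that the PodMonitor scrapes.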

Prometheus Metrics

| Metric Name | Emitted by | Description | Normal Range |
| --- | --- | --- | --- |
| inferno_worker_load | Worker pods | # of concurrent chunks the worker is processing now | < Capacity |
| inferno_worker_capacity | Worker pods | # of concurrent chunks a worker can process | hardware dependent |
| inferno_worker_ttfa | Worker pods (TTS only) | Time to First Audio | < 200 ms |
| inferno_worker_rtf | Worker pods | Real time factor | < 1 |
| api_queue_size | API server pod | Request queue size per offering | Low |
| api_unserviceable_requests_size | API server pod | Unserviceable requests count | 0 |
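The normal ranges in the table can be turned into a quick health check. The following is a minimal sketch, not Cartesia tooling: it parses Prometheus text exposition output from a single pod and flags values outside the documented ranges. It assumes each metric is exposed as a single plain sample (not a histogram or summary) and that `inferno_worker_ttfa` is reported in milliseconds; neither is confirmed by this page. The function names `parse_samples` and `check_normal_ranges` are hypothetical. `api_queue_size` is skipped because the table gives no numeric threshold for "Low".

```python
import re
from typing import Dict, List

# Matches "name value" or "name{labels} value"; any trailing timestamp is ignored.
_SAMPLE = re.compile(r"^([A-Za-z_:][A-Za-z0-9_:]*)(\{[^}]*\})?\s+(\S+)")


def parse_samples(text: str) -> Dict[str, List[float]]:
    """Collect sample values by metric name, skipping comments and blank lines."""
    samples: Dict[str, List[float]] = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        match = _SAMPLE.match(line)
        if match:
            samples.setdefault(match.group(1), []).append(float(match.group(3)))
    return samples


def check_normal_ranges(samples: Dict[str, List[float]],
                        ttfa_limit_ms: float = 200.0) -> List[str]:
    """Return a description of each value outside the table's normal range."""
    problems: List[str] = []

    def first(name: str):
        # Assumption: one sample per metric per pod.
        values = samples.get(name)
        return values[0] if values else None

    load = first("inferno_worker_load")
    capacity = first("inferno_worker_capacity")
    ttfa = first("inferno_worker_ttfa")
    rtf = first("inferno_worker_rtf")
    unserviceable = first("api_unserviceable_requests_size")

    if load is not None and capacity is not None and load >= capacity:
        problems.append(f"inferno_worker_load {load} is not < capacity {capacity}")
    # Assumption: the TTFA value is in milliseconds.
    if ttfa is not None and ttfa >= ttfa_limit_ms:
        problems.append(f"inferno_worker_ttfa {ttfa} is not < {ttfa_limit_ms} ms")
    if rtf is not None and rtf >= 1:
        problems.append(f"inferno_worker_rtf {rtf} is not < 1")
    if unserviceable is not None and unserviceable != 0:
        problems.append(f"api_unserviceable_requests_size {unserviceable} is not 0")
    return problems


if __name__ == "__main__":
    # Made-up sample values for illustration only.
    example = "\n".join([
        "# TYPE inferno_worker_load gauge",
        "inferno_worker_load 3",
        "inferno_worker_capacity 4",
        "inferno_worker_ttfa 250",
        "inferno_worker_rtf 0.4",
    ])
    for problem in check_normal_ranges(parse_samples(example)):
        print(problem)
```

In practice these conditions would more likely live in Prometheus alerting rules; the script form is just a way to make the thresholds concrete. Metrics that are absent from the input are ignored, so the same check works on output from either a worker pod or the API server pod.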