Cartesia’s inference cluster supports Prometheus, an open-source metrics and monitoring solution. All metrics are scraped every 5 seconds via a PodMonitor from the /metrics endpoint on port 8080.
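To inspect what a pod exposes outside of Prometheus, the endpoint can be read directly. The following is a minimal sketch, assuming the pod is reachable over plain HTTP and serves the standard Prometheus text exposition format. The function names, the `pod` label, and the sample values are hypothetical and only illustrate the parsing.

```python
from urllib.request import urlopen


def parse_metrics(text):
    """Parse Prometheus text exposition lines into (name, labels, value) tuples."""
    samples = []
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):  # skip blank lines and comments
            continue
        if "{" in line:
            # Labelled sample: name{label="value",...} value [timestamp]
            name = line[: line.index("{")]
            labels = line[line.index("{") + 1 : line.rindex("}")]
            rest = line[line.rindex("}") + 1 :].split()
        else:
            # Unlabelled sample: name value [timestamp]
            name, *rest = line.split()
            labels = ""
        samples.append((name, labels, float(rest[0])))
    return samples


def fetch_metrics(host, port=8080, timeout=5.0):
    """Fetch and parse /metrics from one pod (host is supplied by the caller)."""
    with urlopen(f"http://{host}:{port}/metrics", timeout=timeout) as resp:
        return parse_metrics(resp.read().decode("utf-8"))


# Illustrative input only: the label name and values are made up.
SAMPLE = """\
# illustrative sample, not real output
inferno_worker_load{pod="worker-0"} 3
inferno_worker_capacity{pod="worker-0"} 8
api_queue_size 0
"""

for name, labels, value in parse_metrics(SAMPLE):
    print(name, labels, value)
```

`fetch_metrics` is not called here because it needs network access to a pod. In a cluster it would typically be pointed at a pod IP or a port-forwarded address.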

Prometheus Metrics

| Metric Name | Emitted by | Description | Normal Range |
| --- | --- | --- | --- |
| inferno_worker_load | Worker pods | # of concurrent chunks the worker is processing now | < Capacity |
| inferno_worker_capacity | Worker pods | # of concurrent chunks a worker can process | Hardware dependent |
| inferno_worker_ttfa | Worker pods (TTS only) | Time to First Audio | < 200 ms |
| inferno_worker_rtf | Worker pods | Real-time factor | < 1 |
| api_queue_size | API server pod | Request queue size per offering | Low |
| api_unserviceable_requests_size | API server pod | Unserviceable requests count | 0 |
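The normal ranges above can be turned into simple checks. The sketch below is one possible way to do that, not an official health check. The function names are hypothetical. It assumes TTFA is passed in milliseconds (the table does not state the unit the metric is exported in), and because the table only describes the queue size as "Low", the `max_queue` threshold is a placeholder the operator must choose.

```python
def check_worker(load, capacity, rtf, ttfa_ms=None):
    """Return a list of out-of-range findings for one worker pod."""
    problems = []
    if load >= capacity:  # normal: load < capacity
        problems.append(f"load {load} is not below capacity {capacity}")
    if rtf >= 1:  # normal: real-time factor < 1
        problems.append(f"rtf {rtf} is not below 1")
    # TTFA is only emitted by TTS workers, so it is optional here.
    if ttfa_ms is not None and ttfa_ms >= 200:  # normal: < 200 ms
        problems.append(f"ttfa {ttfa_ms} ms is not below 200 ms")
    return problems


def check_api(queue_size, unserviceable, max_queue=10):
    """Return a list of out-of-range findings for the API server pod.

    max_queue is an assumed placeholder; the table only says "Low".
    """
    problems = []
    if queue_size > max_queue:
        problems.append(f"queue size {queue_size} exceeds {max_queue}")
    if unserviceable != 0:  # normal: 0
        problems.append(f"{unserviceable} unserviceable requests")
    return problems


# Made-up readings for illustration.
print(check_worker(load=3, capacity=8, rtf=0.4, ttfa_ms=120))
print(check_worker(load=8, capacity=8, rtf=1.2))
print(check_api(queue_size=2, unserviceable=1))
```

In practice the same conditions would more likely live in Prometheus alerting rules; the Python form is only meant to make the thresholds explicit.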