/metrics.
Prometheus Metrics
| Metric Name | Emitted by | Description | Normal Range |
|---|---|---|---|
| inferno_worker_load | Worker pods | Number of chunks the worker is currently processing concurrently | < capacity |
| inferno_worker_capacity | Worker pods | Maximum number of chunks a worker can process concurrently | Hardware dependent |
| inferno_worker_ttfa | Worker pods (TTS only) | Time to first audio (TTFA) | < 200 ms |
| inferno_worker_rtf | Worker pods | Real-time factor (RTF) | < 1 |
| api_queue_size | API server pod | Request queue size per offering | Low |
| api_unserviceable_requests_size | API server pod | Number of unserviceable requests | 0 |
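
The normal ranges in the table can be checked against a scraped `/metrics` payload. The following is a minimal sketch, not part of the product. It assumes each metric is exposed as a plain gauge in the Prometheus text format, that `inferno_worker_ttfa` is reported in milliseconds, and that only the first series of each metric matters. The helper names `parse_metrics` and `find_violations` are hypothetical. `api_queue_size` is not checked because the table gives no numeric bound for "Low".

```python
import re

# One sample line of the Prometheus text format: name{labels} value [timestamp].
# Simplification: label values containing "}" are not handled.
SAMPLE_RE = re.compile(r'^([a-zA-Z_:][a-zA-Z0-9_:]*)(?:\{[^}]*\})?\s+(\S+)')


def parse_metrics(text):
    """Return {metric_name: [values]} parsed from a /metrics payload."""
    samples = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):  # skip blanks, HELP and TYPE lines
            continue
        match = SAMPLE_RE.match(line)
        if match is None:
            continue
        name, raw = match.groups()
        try:
            value = float(raw)
        except ValueError:
            continue
        samples.setdefault(name, []).append(value)
    return samples


def find_violations(samples, ttfa_limit_ms=200.0):
    """Return the names of metrics outside the normal ranges in the table.

    Assumption: inferno_worker_ttfa is in milliseconds. Metrics that are
    absent from the payload are skipped rather than reported.
    """
    def first(name):
        values = samples.get(name)
        return values[0] if values else None

    violations = []
    load = first("inferno_worker_load")
    capacity = first("inferno_worker_capacity")
    if load is not None and capacity is not None and load >= capacity:
        violations.append("inferno_worker_load")

    ttfa = first("inferno_worker_ttfa")
    if ttfa is not None and ttfa >= ttfa_limit_ms:
        violations.append("inferno_worker_ttfa")

    rtf = first("inferno_worker_rtf")
    if rtf is not None and rtf >= 1.0:
        violations.append("inferno_worker_rtf")

    unserviceable = first("api_unserviceable_requests_size")
    if unserviceable is not None and unserviceable != 0:
        violations.append("api_unserviceable_requests_size")

    return violations


if __name__ == "__main__":
    # Made-up payload for illustration only.
    payload = (
        "# TYPE inferno_worker_load gauge\n"
        "inferno_worker_load 3\n"
        "inferno_worker_capacity 4\n"
        "inferno_worker_ttfa 250\n"
        "inferno_worker_rtf 0.4\n"
    )
    print(find_violations(parse_metrics(payload)))
```

Worker and API server metrics come from different pods, so in practice the function would be run once per scraped endpoint; metrics that an endpoint does not expose are simply skipped.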