

Pod Auto-Scaling (KEDA)

KEDA ScaledObjects use Prometheus-based metrics with the following triggers:
| Trigger | Metric | Threshold | Condition |
| --- | --- | --- | --- |
| Worker Load | inferno_worker_load / inferno_worker_capacity | 0.8 (80%) | Always active |
| Queue-based | api_queue_size / capacity (overflow mode) | 1.0 | Only when minReplicas=0 |
| Queue-based | api_unserviceable_requests_size | 0.9 | Only when minReplicas=0 |
Scaling behavior:
  • Polling interval: 15 seconds
  • Scale-up stabilization: 30 seconds
  • Scale-down stabilization: 900 seconds (15 min)
  • Scale-down policy: Remove 1 pod per 60 seconds
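The trigger and behavior settings above can be sketched as a KEDA ScaledObject. This is a minimal illustration, not the deployed manifest: the resource names and Prometheus server address are assumptions, and only the always-active worker-load trigger is shown.

```yaml
apiVersion: keda.sh/v1alpha1
kind: ScaledObject
metadata:
  name: inferno-worker          # hypothetical name
spec:
  scaleTargetRef:
    name: inferno-worker        # hypothetical Deployment name
  pollingInterval: 15           # poll metrics every 15 seconds
  advanced:
    horizontalPodAutoscalerConfig:
      behavior:
        scaleUp:
          stabilizationWindowSeconds: 30    # scale-up stabilization
        scaleDown:
          stabilizationWindowSeconds: 900   # 15-minute scale-down stabilization
          policies:
            - type: Pods
              value: 1                      # remove at most 1 pod...
              periodSeconds: 60             # ...per 60 seconds
  triggers:
    - type: prometheus
      metadata:
        serverAddress: http://prometheus:9090   # assumed Prometheus endpoint
        query: sum(inferno_worker_load) / sum(inferno_worker_capacity)
        threshold: "0.8"        # scale out above 80% worker load
```

The two queue-based triggers follow the same `prometheus` trigger shape and apply only when minReplicas=0 (scale-from-zero).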

Cluster/Node Auto-Scaling

Uses the Cluster Autoscaler:
  • Scan interval: 10 seconds
  • Scale-down delay: 10 minutes after a node is added
  • Scale-down unneeded time: 10 minutes
  • Expander: least-waste (bin-packing)
  • Metric: Pending pods that can’t be scheduled due to insufficient resources
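The node-level settings above correspond to standard Cluster Autoscaler flags. As a sketch, they might appear in the autoscaler's container args like this (the deployment layout is an assumption; only the flag values come from the list above):

```yaml
# Excerpt from a cluster-autoscaler Deployment spec (illustrative)
containers:
  - name: cluster-autoscaler
    command:
      - ./cluster-autoscaler
      - --scan-interval=10s                # check for unschedulable pods every 10s
      - --scale-down-delay-after-add=10m   # wait 10 min after a node is added
      - --scale-down-unneeded-time=10m     # node must be unneeded for 10 min
      - --expander=least-waste             # bin-packing node-group selection
```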

Metrics Used for Scaling

The autoscaling triggers above use Prometheus metrics exposed by the application. See the Metrics and Monitoring page for the full list of available metrics.