Setup autoscaling with KEDA and Levitate


This document provides step-by-step instructions for setting up autoscaling with KEDA and Levitate.

What is KEDA

KEDA is a Kubernetes-based Event Driven Autoscaler. With KEDA, you can drive the scaling of any container in Kubernetes based on the number of events needing to be processed. KEDA works by integrating with various event sources and metrics sources such as Levitate to dynamically adjust the number of replicas of Kubernetes deployments, StatefulSets, or any other scalable resources.


Pre-requisites

Create a Levitate cluster by following Getting Started.

Keep the following information handy after creating the cluster:

  • $levitate_read_url - Levitate's Read endpoint
  • $levitate_username - Cluster ID
  • $levitate_password - Read token created for the cluster
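For the commands that follow, it can help to keep these values in environment variables. The values below are placeholders, not real endpoints; substitute the details from your own Levitate cluster:

```shell
# Placeholder values -- replace with the details of your Levitate cluster
export LEVITATE_READ_URL="<your-levitate-read-endpoint>"
export LEVITATE_USERNAME="<your-cluster-id>"
export LEVITATE_PASSWORD="<your-read-token>"
```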

KEDA Installation

Refer to the KEDA docs to install KEDA in your Kubernetes cluster.

Check the default values.yaml and modify it as required. The following command fetches the default values.yaml:

helm show values kedacore/keda > values.yaml
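For reference, a typical Helm-based installation looks like the following; check the KEDA docs for the currently recommended flags and chart version:

```shell
# Add the official KEDA chart repository and install KEDA into its own namespace
helm repo add kedacore https://kedacore.github.io/charts
helm repo update
helm install keda kedacore/keda --namespace keda --create-namespace
```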

Ensure that the following components are installed:

  1. KEDA Operator - Responsible for activating and deactivating Kubernetes Deployments, scaling them to and from zero when there are no events.
  2. KEDA Metrics Server - A Kubernetes metrics server that exposes rich event data, such as queue length or stream lag, to the Horizontal Pod Autoscaler to drive scale-out. It is up to the Deployment to consume the events directly from the source. This preserves rich event integration and enables gestures like completing or abandoning queue messages to work out of the box. Serving metrics is the primary role of the keda-operator-metrics-apiserver container that runs when you install KEDA.
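Once the installation completes, you can verify that both components are running. This assumes KEDA was installed into the namespace keda (the chart default):

```shell
# Both the operator and the metrics API server should be in Running state
kubectl get pods -n keda
# Expect pods named keda-operator-* and keda-operator-metrics-apiserver-*
```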

KEDA Scalers

KEDA scalers are the components within KEDA that interact with different event sources or metrics services to determine the current demand for an application and scale it accordingly. Each scaler is responsible for a specific type of event source or metric. For example, there are scalers for Prometheus, Azure Service Bus, RabbitMQ, Kafka, HTTP requests, and many others.

Scalers work by polling the event source at a specified interval to retrieve metrics that indicate the current load or demand. They then use this information to scale the application in or out. The scaling parameters and the thresholds for scaling can be customized through the scaler's configuration.

Keep the following parameters handy; they act as levers for defining scaling strategies in KEDA.

  • serverAddress - Read URL of the Levitate cluster
  • metricName - Name of the metric on which the scaling decision is based
  • threshold - Value at which scaling starts (can be a float)
  • query - PromQL query to run


ScaledObject

ScaledObject is a CRD that KEDA uses to define rule sets for scaling strategies.

Below is a sample configuration for a ScaledObject, which can be applied via kubectl.

# scaled-object.yaml
apiVersion: keda.sh/v1alpha1
kind: ScaledObject
metadata:
  name: template-deployment-traffic
  namespace: production
spec:
  scaleTargetRef:
    name: rails-app
  pollingInterval: 30 # Must be seconds
  minReplicaCount: 2
  maxReplicaCount: 3
  triggers:
    - type: prometheus
      metadata:
        serverAddress: https://<$levitate_username>:<$levitate_password>@<$levitate_read_url>
        metricName: rails_requests_total
        threshold: "100000"
        query: sum(increase(rails_requests_total{kubernetes_namespace="production", app="web"}[4m]))
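Before wiring the query into KEDA, it can be useful to dry-run it against Levitate directly. This sketch assumes the read endpoint exposes the standard Prometheus HTTP API and that the $levitate_* placeholders are filled in with your cluster details:

```shell
# Run the scaling query against the Levitate read endpoint to sanity-check
# that it returns the value you expect
curl -sG "https://<$levitate_username>:<$levitate_password>@<$levitate_read_url>/api/v1/query" \
  --data-urlencode 'query=sum(increase(rails_requests_total{kubernetes_namespace="production", app="web"}[4m]))'
```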

Let's drill down into the trigger configuration.


type

Levitate is a Prometheus-compatible time series metrics warehouse, so you must use a trigger with type prometheus in the YAML configuration to trigger scaling.


serverAddress

This is the URL of the metrics source - the Levitate cluster's read URL from which the metrics will be read.


metricName

Name of the metric that will be used to evaluate the trigger condition.


threshold

The value of the query at which scaling triggers, in the form of creating a new pod.


query

This query checks the number of requests served by the application. The threshold is 100K. When the request count reaches 100K, a new pod is created.

Similarly, when the usage drops to 50K, a pod is deleted. To determine when to reduce the number of pods, KEDA waits until the result of the query becomes half of the threshold.
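The replica count can be sketched as a ceiling division of the metric value by the threshold, clamped to the configured min/max. This is a simplification: the real HPA calculation also involves per-replica averaging, tolerances, and stabilization windows. The numbers below mirror the sample ScaledObject:

```shell
# Simplified sketch: desired replicas = ceil(metric / threshold),
# clamped to [minReplicaCount, maxReplicaCount] from the sample config
threshold=100000
min_replicas=2
max_replicas=3

desired_replicas() {
  local metric=$1
  local d=$(( (metric + threshold - 1) / threshold ))  # integer ceiling division
  (( d < min_replicas )) && d=$min_replicas
  (( d > max_replicas )) && d=$max_replicas
  echo "$d"
}

desired_replicas 250000  # ceil(2.5) = 3 -> scale out to maxReplicaCount
desired_replicas 50000   # ceil(0.5) = 1, clamped up to minReplicaCount -> 2
```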

Applying the ScaledObject

Apply the configuration as follows.

kubectl apply -f ./scaled-object.yaml -n production --kubeconfig=$KUBECONFIG

This sets up the trigger for autoscaling the pods based on real-time request traffic from the Rails application.
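To confirm that KEDA picked up the configuration, you can inspect the ScaledObject and the HPA it creates on your behalf (KEDA conventionally names it keda-hpa-<scaledobject-name>):

```shell
# The ScaledObject should report READY=True, and a backing HPA should exist
kubectl get scaledobject -n production
kubectl get hpa -n production
```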


You can read more about the scaling triggers that KEDA supports here.


Please get in touch with us on Discord or Email if you have any questions.