KEDA
Setup autoscaling with KEDA and Last9
Introduction
This document provides step-by-step instructions for setting up autoscaling with KEDA and Last9.
What is KEDA
KEDA is a Kubernetes-based Event Driven Autoscaler. With KEDA, you can drive the scaling of any container in Kubernetes based on the number of events needing to be processed. KEDA works by integrating with various event sources and metrics sources, such as Last9, to dynamically adjust the number of replicas of Kubernetes Deployments, StatefulSets, or any other scalable resources.
Prerequisites
Create a Last9 cluster by following Getting Started.
Keep the following information handy after creating the cluster:
- $levitate_read_url - Last9's Read endpoint
- $levitate_username - Cluster ID
- $levitate_password - Read token created for the cluster
KEDA Installation
Refer to the KEDA docs to install KEDA in your Kubernetes cluster.
Check the default values.yaml and modify it as required. The following command fetches the default values.yaml:

```shell
helm show values kedacore/keda > values.yaml
```
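A typical Helm-based install can then be sketched as follows (the `kedacore` chart repository URL and the `keda` namespace are the defaults from the KEDA docs; adjust to your environment):

```shell
# Add the KEDA Helm repository and refresh the chart index
helm repo add kedacore https://kedacore.github.io/charts
helm repo update

# Install KEDA into its own namespace using your customized values.yaml
helm install keda kedacore/keda --namespace keda --create-namespace -f values.yaml
```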
Ensure that the following components are installed.
- KEDA Operator - Responsible for activation and deactivation of Kubernetes Deployments to scale to and from zero on no events
- KEDA Metric Server - This is a Kubernetes metrics server that exposes rich event data like queue length or stream lag to the Horizontal Pod Autoscaler to drive scale out
It is up to the Deployment to consume the events directly from the source. This preserves rich event integration and enables gestures like completing or abandoning queue messages to work out of the box. Metric serving is the primary role of the keda-operator-metrics-apiserver container that runs when you install KEDA.
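You can confirm both components are running after the install (assuming KEDA was installed into the default `keda` namespace):

```shell
# Both the operator and the metrics API server should show Running pods
kubectl get pods -n keda
```

Look for pods named like `keda-operator-*` and `keda-operator-metrics-apiserver-*` in the output.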
KEDA Scalers
KEDA scalers are the components within KEDA that interact with different event sources or metrics services to determine the current demand for an application and scale it accordingly. Each scaler is responsible for a specific type of event source or metric. For example, there are scalers for Prometheus, Azure Service Bus, RabbitMQ, Kafka, HTTP requests, and many others.
Scalers work by polling the event source at a specified interval to retrieve metrics that indicate the current load or demand. They then use this information to scale the application in or out. The scaling parameters and the thresholds for scaling can be customized through the scaler's configuration.
Keep the following parameters handy; they are the variables that act as levers to define scaling strategies in KEDA.
- serverAddress - Read URL of the Last9 cluster
- metricName - Name of the metric on which the scaling decision will be based
- threshold - Value at which to start scaling (this value can be a float)
- query - PromQL query to run
ScaledObject
ScaledObject is a CRD that KEDA uses as a rule set to define scale strategies. Below is a sample configuration for a ScaledObject, which can be applied via kubectl.
```yaml
# scaled-object.yaml
apiVersion: keda.sh/v1alpha1
kind: ScaledObject
metadata:
  name: template-deployment-traffic
  namespace: production
spec:
  scaleTargetRef:
    name: rails-app
  pollingInterval: 30 # Must be seconds
  minReplicaCount: 2
  maxReplicaCount: 3
  triggers:
    - type: prometheus
      metadata:
        serverAddress: https://<$levitate_username>:<$levitate_password>@<$levitate_read_url>
        metricName: rails_requests_total
        threshold: "100000"
        query: sum(increase(rails_requests_total{kubernetes_namespace="production", app="web"}[4m]))
```
Let's drill down into the trigger configuration.
type
Last9 is a Prometheus-compatible telemetry data platform, so you must use the trigger type prometheus in the YAML configuration to trigger scaling.
serverAddress
This is the URL of the metrics source: the Last9 cluster's Read URL, from which the metrics will be read.
metricName
Name of the metric which will be used to evaluate the trigger condition.
threshold
The value of the query result at which scaling triggers, in the form of creating a new pod.
query
This query checks the number of requests served by the application. The threshold is 100K: when the request count reaches 100K, a new pod is created.
Similarly, when usage drops to 50K, a pod is deleted. To scale back in, KEDA waits until the result of the query falls to half of the threshold.
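Conceptually, the trigger feeds the query result to the Horizontal Pod Autoscaler, which derives a desired replica count from the metric and the threshold, clamped between minReplicaCount and maxReplicaCount. A simplified sketch (the function name and exact clamping are illustrative, not KEDA's actual implementation, which also accounts for stabilization windows and averaging):

```python
import math

def desired_replicas(query_value: float, threshold: float,
                     min_replicas: int, max_replicas: int) -> int:
    """Approximate the replica count the trigger drives the HPA toward."""
    wanted = math.ceil(query_value / threshold)
    return max(min_replicas, min(max_replicas, wanted))

# With the sample config above (threshold 100000, min 2, max 3):
print(desired_replicas(250_000, 100_000, 2, 3))  # -> 3, scale out capped at maxReplicaCount
print(desired_replicas(50_000, 100_000, 2, 3))   # -> 2, half the threshold, floored at minReplicaCount
```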
Applying the ScaledObject
Apply the configuration as follows.
```shell
kubectl apply -f ./scaled-object.yaml -n production --kubeconfig=$KUBECONFIG
```
This sets up the trigger for autoscaling the pods based on real-time request traffic from the Rails application.
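You can then confirm that the ScaledObject was created and that KEDA generated an HPA for it (resource names follow the sample configuration above):

```shell
# The ScaledObject should report READY=True; KEDA also creates a backing HPA
kubectl get scaledobject template-deployment-traffic -n production
kubectl get hpa -n production
```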
You can read more about the scaling triggers that KEDA supports here.
Troubleshooting
Please get in touch with us on Discord or via email if you have any questions.