KEDA

Set up autoscaling with KEDA and Last9

Introduction

This document provides step-by-step instructions for setting up autoscaling with KEDA and Last9.

What is KEDA

KEDA is a Kubernetes-based Event Driven Autoscaler. With KEDA, you can drive the scaling of any container in Kubernetes based on the number of events needing to be processed. KEDA works by integrating with various event sources and metrics sources such as Last9 to dynamically adjust the number of replicas of Kubernetes deployments, StatefulSets, or any other scalable resources.

Prerequisites

Create a Last9 cluster by following the Getting Started guide.

Keep the following information handy after creating the cluster:

$levitate_read_url - Last9's Read endpoint
$levitate_username - Cluster ID
$levitate_password - Read token created for the cluster
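
These three values are later combined into a single URL of the form https://<username>:<password>@<read_url>. As a minimal sketch, assuming the token may contain characters such as "@" or "/" that are not safe inside a URL, the credentials can be percent-encoded before they are embedded. The function name build_server_address and the sample values are hypothetical, not part of Last9 or KEDA.

```python
from urllib.parse import quote


def build_server_address(read_url: str, username: str, password: str) -> str:
    """Embed basic-auth credentials into an https read URL."""
    # Drop any scheme from the read URL so it is not duplicated.
    host_and_path = read_url.split("://", 1)[-1]
    # Percent-encode the credentials; safe="" also encodes "/" and "@".
    user = quote(username, safe="")
    secret = quote(password, safe="")
    return f"https://{user}:{secret}@{host_and_path}"


if __name__ == "__main__":
    # Placeholder values, not real credentials.
    print(build_server_address("read.example.com/prom", "cluster-id", "p@ss/word"))
```

Whether encoding is needed depends on the characters in your token; a token made only of letters and digits passes through unchanged.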

KEDA Installation

Refer to the KEDA docs to install KEDA in your Kubernetes cluster.

Review the default values.yaml and modify it as required. The following command writes the default values to values.yaml:

helm show values kedacore/keda > values.yaml

Ensure that the following components are installed.

KEDA Operator - Activates and deactivates Kubernetes Deployments, scaling them to and from zero when there are no events.
KEDA Metrics Server - A Kubernetes metrics server that exposes rich event data, such as queue length or stream lag, to the Horizontal Pod Autoscaler to drive scale-out. It is up to the Deployment to consume the events directly from the source. This preserves rich event integration and enables gestures like completing or abandoning queue messages to work out of the box. Serving metrics is the primary role of the keda-operator-metrics-apiserver container that runs when you install KEDA.

KEDA Scalers

KEDA scalers are the components within KEDA that interact with different event sources or metrics services to determine the current demand for an application and scale it accordingly. Each scaler is responsible for a specific type of event source or metric. For example, there are scalers for Prometheus, Azure Service Bus, RabbitMQ, Kafka, HTTP requests, and many others.

Scalers work by polling the event source at a specified interval to retrieve metrics that indicate the current load or demand. They then use this information to scale the application in or out. The scaling parameters and the thresholds for scaling can be customized through the scaler's configuration.

Keep the following parameters handy. They are the levers that define scaling strategies in KEDA.

serverAddress - Read URL of Last9 Cluster
metricName - Name of the metric on which scaling decision will be based
threshold - Value at which scaling starts (can be a float)
query - PromQL Query to run
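
KEDA reads trigger metadata values as strings, which is why a numeric threshold is written in quotes in the YAML. The sketch below collects the four parameters into a trigger entry and converts the threshold to a string after checking that it is numeric. The helper name build_prometheus_trigger and the sample arguments are illustrative assumptions, not part of KEDA.

```python
import json


def build_prometheus_trigger(server_address, metric_name, threshold, query):
    """Return a KEDA trigger entry whose metadata values are all strings."""
    # float() raises ValueError if the threshold is not numeric.
    float(threshold)
    return {
        "type": "prometheus",
        "metadata": {
            "serverAddress": server_address,
            "metricName": metric_name,
            "threshold": str(threshold),
            "query": query,
        },
    }


if __name__ == "__main__":
    # Placeholder address and query; substitute your own values.
    trigger = build_prometheus_trigger(
        "https://user:token@read.example.com",
        "rails_requests_total",
        100000,
        "sum(increase(rails_requests_total[4m]))",
    )
    print(json.dumps(trigger, indent=2))
```

The printed JSON has the same shape as one entry under triggers in a ScaledObject.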

ScaledObject

ScaledObject is a custom resource definition (CRD) that KEDA uses to define the rules of a scaling strategy.

The following is a sample ScaledObject configuration that can be applied with kubectl.

# scaled-object.yaml
apiVersion: keda.sh/v1alpha1
kind: ScaledObject
metadata:
  name: template-deployment-traffic
  namespace: production
spec:
  scaleTargetRef:
    name: rails-app
  pollingInterval: 30 # In seconds
  minReplicaCount: 2
  maxReplicaCount: 3
  triggers:
    - type: prometheus
      metadata:
        serverAddress: https://<$levitate_username>:<$levitate_password>@<$levitate_read_url>
        metricName: rails_requests_total
        threshold: "100000"
        query: sum(increase(rails_requests_total{kubernetes_namespace="production", app="web"}[4m]))

Let's drill down into the trigger configuration.

type

Last9 is a Prometheus-compatible time series metrics warehouse, so you must use a trigger of type prometheus in the YAML configuration.

serverAddress

This is the URL of the metrics source: the Last9 cluster's read URL, from which the metrics are read.

metricName

Name of the metric used to evaluate the trigger condition.

threshold

The value of the query result at which scaling is triggered by creating a new pod.

query

The PromQL query to run. In the sample, the query computes the number of requests served by the application over the last four minutes. The threshold is 100K: when the request count reaches 100K, a new pod is created.

Similarly, when the request count drops to 50K, a pod is removed. To decide when to reduce the number of pods, KEDA waits until the result of the query falls to half of the threshold.
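
Before applying the ScaledObject, it can help to run the same query against the read endpoint and compare the result with the threshold. The sketch below uses the standard Prometheus HTTP API path /api/v1/query with basic authentication. It assumes that read_url is a full https URL without embedded credentials and that the query returns an instant vector, as the sum(...) query in the sample does. The function names are hypothetical.

```python
import base64
import json
import urllib.parse
import urllib.request


def build_query_request(read_url, username, password, promql):
    """Build a GET request for the Prometheus instant-query endpoint."""
    params = urllib.parse.urlencode({"query": promql})
    url = f"{read_url.rstrip('/')}/api/v1/query?{params}"
    token = base64.b64encode(f"{username}:{password}".encode()).decode()
    return urllib.request.Request(url, headers={"Authorization": f"Basic {token}"})


def first_sample_value(payload):
    """Return the first sample of an instant-vector response as a float, or None."""
    if payload.get("status") != "success":
        raise ValueError(f"query failed: {payload.get('error', 'unknown error')}")
    result = payload["data"]["result"]
    if not result:
        return None
    # Each vector sample looks like {"metric": {...}, "value": [timestamp, "value"]}.
    return float(result[0]["value"][1])


def check_query(read_url, username, password, promql, threshold):
    """Run the query and report whether its value has reached the threshold."""
    request = build_query_request(read_url, username, password, promql)
    with urllib.request.urlopen(request, timeout=10) as response:
        value = first_sample_value(json.load(response))
    return value, (value is not None and value >= threshold)
```

Calling check_query with the read URL, cluster ID, read token, the query string, and the threshold returns the current value and whether it has reached the threshold. An empty result (None) usually means the label selectors in the query match no series.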

Applying the ScaledObject

Apply the configuration as follows.

kubectl apply -f ./scaled-object.yaml -n production --kubeconfig=$KUBECONFIG

This sets up the trigger for autoscaling the pods based on real-time request traffic to the Rails application.

Tip: You can read more about the scaling triggers that KEDA supports in the KEDA documentation.

Troubleshooting

Please get in touch with us on Discord or by email if you have any questions.
