KEDA (Kubernetes Event Driven Autoscaling)

KEDA (Kubernetes-based Event Driven Autoscaler) extends the Kubernetes Horizontal Pod Autoscaler (HPA) so that resources can scale on events originating both inside and outside the cluster.

ScaledObjects

KEDA defines a custom resource called a ScaledObject. A ScaledObject can have one or more triggers and can reference a variety of Kubernetes resource types, such as Deployments, StatefulSets, and even other custom resources that expose the /scale subresource.

Each ScaledObject has two essential components: the scaleTargetRef, which defines the resource to scale, and the triggers, which define the events that cause that resource to scale.

scaleTargetRef

The scaleTargetRef accepts four keys. The only required key is name, the name of the resource you wish to scale. This resource must be in the same namespace as the ScaledObject.

The apiVersion and kind keys are optional unless you want to scale a resource other than a Deployment. The envSourceContainerName key is also optional; it selects the container from which KEDA reads environment variables and defaults to the first container in the pod template.

Example

spec:
  scaleTargetRef:
    apiVersion:    {api-version-of-target-resource}  # Optional. Default: apps/v1
    kind:          {kind-of-target-resource}         # Optional. Default: Deployment
    name:          {name-of-target-resource}         # Mandatory. Must be in the same namespace as the ScaledObject
    envSourceContainerName: {container-name}         # Optional. Default: .spec.template.spec.containers[0]
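
As an illustration of the optional keys, the sketch below targets a StatefulSet instead of a Deployment. The names queue-worker and worker are hypothetical placeholders, not values taken from this guide.

```yaml
spec:
  scaleTargetRef:
    apiVersion: apps/v1              # same API group as the default
    kind: StatefulSet                # must be set because the target is not a Deployment
    name: queue-worker               # hypothetical StatefulSet in the ScaledObject's namespace
    envSourceContainerName: worker   # hypothetical container; omit to use the first container
```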

triggers

The triggers for a ScaledObject can use common Kubernetes metrics such as cpu and memory consumption. While helpful, the true benefit of a ScaledObject is the ability to query external metrics and events to scale your resources. KEDA provides scalers for a wide variety of services, and most have their own parameters. Check the list of currently available scalers regularly, as new scalers are added frequently.
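
A minimal sketch of the resource-based triggers mentioned above might look like the following. It assumes a KEDA release that accepts metricType at the trigger level (older releases place the type inside metadata), and the threshold values are arbitrary examples.

```yaml
triggers:
- type: cpu
  metricType: Utilization   # percentage of the CPU requested by the pods
  metadata:
    value: "60"             # example target: 60% average utilization
- type: memory
  metricType: Utilization   # percentage of the memory requested by the pods
  metadata:
    value: "75"             # example target: 75% average utilization
```

Utilization targets are computed against container resource requests, so the target pods need cpu and memory requests defined for these triggers to work.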

Optional Parameters

Other optional parameters are available to better tune the scale events to meet the requirements of your application. For more detail on these parameters, see the ScaledObject Spec documentation.

Example

spec:
  scaleTargetRef:
    apiVersion:    {api-version-of-target-resource}  # Optional. Default: apps/v1
    kind:          {kind-of-target-resource}         # Optional. Default: Deployment
    name:          {name-of-target-resource}         # Mandatory. Must be in the same namespace as the ScaledObject
    envSourceContainerName: {container-name}         # Optional. Default: .spec.template.spec.containers[0]
  pollingInterval:  30                               # Optional. Default: 30 seconds
  cooldownPeriod:   300                              # Optional. Default: 300 seconds
  idleReplicaCount: 0                                # Optional. Default: ignored, must be less than minReplicaCount 
  minReplicaCount:  1                                # Optional. Default: 0
  maxReplicaCount:  100                              # Optional. Default: 100
  fallback:                                          # Optional. Section to specify fallback options
    failureThreshold: 3                              # Mandatory if fallback section is included
    replicas: 6                                      # Mandatory if fallback section is included
  advanced:                                          # Optional. Section to specify advanced options
    restoreToOriginalReplicaCount: true/false        # Optional. Default: false
    horizontalPodAutoscalerConfig:                   # Optional. Section to specify HPA related options
      name: {name-of-hpa-resource}                   # Optional. Default: keda-hpa-{scaled-object-name}
      behavior:                                      # Optional. Use to modify HPA's scaling behavior
        scaleDown:
          stabilizationWindowSeconds: 300
          policies:
          - type: Percent
            value: 100
            periodSeconds: 15
  triggers:
  ...
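
For orientation, here is how a spec like the one above sits inside a complete manifest. This is a sketch under stated assumptions: the resource names and namespace are hypothetical, and a single cpu trigger stands in for whatever scaler you actually use.

```yaml
apiVersion: keda.sh/v1alpha1
kind: ScaledObject
metadata:
  name: web-scaler        # hypothetical ScaledObject name
  namespace: web          # hypothetical; must be the namespace of the target
spec:
  scaleTargetRef:
    name: web             # hypothetical Deployment (apiVersion and kind use defaults)
  minReplicaCount: 2
  maxReplicaCount: 10
  advanced:
    horizontalPodAutoscalerConfig:
      behavior:
        scaleDown:
          stabilizationWindowSeconds: 300   # wait 5 minutes before scaling down
  triggers:
  - type: cpu
    metricType: Utilization
    metadata:
      value: "60"
```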

Examples

AWS CloudWatch Triggers for SQS queue length

As a basic example of triggering scale events, we'll use CloudWatch Metrics to query the queue length of a given SQS queue. In the example below, KEDA triggers a scale event when ApproximateNumberOfMessagesVisible, evaluated using the metricStatPeriod and metricCollectionTime settings, holds above the targetMetricValue.

The identityOwner specifies which resource maintains the credentials required to query the CloudWatch Metrics API. Ideally, this parameter should be set to operator, as the KEDA Operator should be configured with the appropriate permissions to query CloudWatch Metrics.

The expression parameter lets you write a query directly, using the same syntax as in the AWS CloudWatch Metrics console, without providing the dimensionName or dimensionValue parameters. It also lets you run more complex queries to obtain the values you want to scale on.

Example

triggers:
- type: aws-cloudwatch
  metadata:
    namespace: AWS/SQS # Required: namespace
    dimensionName: QueueName # Required if not using expression query
    dimensionValue: keda # Required if not using expression query
    # Optional: Expression query - Required if not using dimension name & value
    expression: SELECT MAX("ApproximateNumberOfMessagesVisible") FROM "AWS/SQS" WHERE QueueName = 'keda'
    metricName: ApproximateNumberOfMessagesVisible
    targetMetricValue: "2.1"
    minMetricValue: "1.5"
    awsRegion: "eu-west-1" # Required: region
    awsAccessKeyIDFromEnv: AWS_ACCESS_KEY_ID # Optional: AWS Access Key ID, can use TriggerAuthentication as well (default AWS_ACCESS_KEY_ID)
    awsSecretAccessKeyFromEnv: AWS_SECRET_ACCESS_KEY # Optional: AWS Secret Access Key, can use TriggerAuthentication as well (default AWS_SECRET_ACCESS_KEY)
    identityOwner: pod | operator # Optional. Default: pod
    metricCollectionTime: "300" # Optional: Collection Time (default 300)
    metricStat: "Average" # Optional: Metric Statistic (default "Average")
    metricStatPeriod: "300" # Optional: Metric Statistic Period (default 300)
    metricUnit: "Count" # Optional: Metric Unit (default "")
    metricEndTimeOffset: "60" # Optional: Metric EndTime Offset (default 0)
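
The example above lists both query styles side by side. A trimmed sketch that relies on the expression alone could look like this; it reuses the values shown above, keeps namespace because the example marks it as required, and assumes the KEDA operator holds the CloudWatch permissions.

```yaml
triggers:
- type: aws-cloudwatch
  metadata:
    namespace: AWS/SQS          # kept because it is marked required above
    # The expression replaces dimensionName and dimensionValue
    expression: SELECT MAX("ApproximateNumberOfMessagesVisible") FROM "AWS/SQS" WHERE QueueName = 'keda'
    targetMetricValue: "2.1"
    minMetricValue: "1.5"
    awsRegion: "eu-west-1"
    identityOwner: operator     # assumes the operator can query CloudWatch Metrics
```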

For more detail on the other optional trigger parameters, see the documentation for the aws-cloudwatch scaler.

TargetGroup Latency

In this example, we scale an application on TargetGroup metrics queried from CloudWatch. The challenge is that target groups created by the AWS Load Balancer Controller have a unique ID appended to their names, so the name cannot be predicted when deploying a new ingress resource. To resolve the name, we use the lookup() function provided by Helm to query the Kubernetes API and pass the target group name to the trigger in the ScaledObject manifest.

To start, at the beginning of your ScaledObject template, query the deployed TargetGroupBindings using the lookup() function and store the results in a new variable.

Example

{{- $targetGroups := (lookup "elbv2.k8s.aws/v1beta1" "TargetGroupBinding" "" "").items }}

Once the list is stored, iterate over it and select the TargetGroupBinding whose service name matches the deployment from your scaleTargetRef.

Example

  {{- range $index, $group := $targetGroups }}
  {{- if eq .spec.serviceRef.name ( printf "%s-%s" $root.Release.Name $app ) }}
  dimensionValue: {{ .metadata.name }}
  {{- end }}
  {{- end }}

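When the chart is deployed, Helm queries the Kubernetes API for the list of TargetGroupBindings and fills dimensionValue with the name of the matching target group, provided it finds a match. Note that lookup() returns an empty result during helm template and client-side dry runs, because Helm does not contact the cluster in those modes, so nothing is matched there.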

Below is a full Helm template of the ScaledObject manifest for reference.

Example

{{- $root := . }}
{{- if .Values.autoscaler.enabled }}
{{- if $root.Capabilities.APIVersions.Has "keda.sh/v1alpha1" }}
apiVersion: keda.sh/v1alpha1
kind: ScaledObject
metadata:
  name: {{ $root.Release.Name }}
  namespace: {{ $root.Release.Namespace }}
  labels:
    {{- include "my-chart.labels" . | nindent 4 }}
spec:
  scaleTargetRef:
    name: {{ .Values.autoscaler.scaleTargetRef.name | default ( include "my-chart.fullname" . )  }}
    kind: {{ .Values.autoscaler.scaleTargetRef.type | default "Deployment" }}
  pollingInterval:  {{ .Values.autoscaler.pollingInterval | default 30 }}
  cooldownPeriod:   {{ .Values.autoscaler.cooldownPeriod | default 300 }}
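  {{- /* Render only when set: `default 1` would override an explicit 0. */}}
  {{- if hasKey .Values.autoscaler "idleReplicaCount" }}
  idleReplicaCount: {{ .Values.autoscaler.idleReplicaCount }}
  {{- end }}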
  minReplicaCount:  {{ .Values.autoscaler.minReplicaCount | default 3 }}
  maxReplicaCount:  {{ .Values.autoscaler.maxReplicaCount | default 100 }}
  fallback:
    failureThreshold: {{ .Values.autoscaler.fallback.failureThreshold | default 3 }}
    replicas: {{ .Values.autoscaler.fallback.replicas | default 5 }}
  advanced:
    restoreToOriginalReplicaCount: true
    horizontalPodAutoscalerConfig:
      behavior:
        scaleDown:
          stabilizationWindowSeconds: 300
          policies:
          - type: Percent
            value: 100
            periodSeconds: 15
  triggers:
  {{- with .Values.autoscaler.triggers }}
  {{- toYaml . | nindent 4 }}
  {{- end }}
  {{- with .Values.autoscaler.alb_metric }}
  {{- $targetGroups := (lookup "elbv2.k8s.aws/v1beta1" "TargetGroupBinding" "" "").items }}
    {{- range $index, $group := $targetGroups }}
    {{- if eq .spec.serviceRef.name ( include "my-chart.fullname" $root ) }}
    - type: aws-cloudwatch
      metadata:
        namespace: AWS/ApplicationELB
        dimensionName: TargetGroup
        dimensionValue: {{ .metadata.name }}
        metricName: TargetResponseTime
        targetMetricValue: {{ $root.Values.autoscaler.targetMetricValue | default "100" | quote }}
        minMetricValue: {{ $root.Values.autoscaler.minMetricValue | default "0" | quote }}
        awsRegion: {{ $root.Values.autoscaler.awsRegion | default "us-east-1" }}
        identityOwner: operator
    {{- end }}
    {{- end }}
  {{- end }}
  {{- end }}

---
{{- end }}
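
To show how the template above might be driven, here is a sketch of a matching values.yaml. The key names are the ones the template reads; every value is a hypothetical example, and the file assumes the chart defines no conflicting defaults.

```yaml
autoscaler:
  enabled: true
  scaleTargetRef:            # keep this map present; the template reads its keys directly
    name: my-app             # hypothetical; falls back to the chart fullname when empty
    type: Deployment
  pollingInterval: 30
  cooldownPeriod: 300
  minReplicaCount: 3
  maxReplicaCount: 20
  fallback:                  # keep this map present; the template reads its keys directly
    failureThreshold: 3
    replicas: 5
  alb_metric: true           # any non-empty value enables the TargetGroup trigger
  targetMetricValue: "0.5"   # TargetResponseTime is reported in seconds
  minMetricValue: "0"
  awsRegion: eu-west-1
```

Quoting the metric thresholds keeps them as strings, which is what the trigger metadata in the earlier examples uses.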