39.7 KEDA on AKS: Event-Driven Scaling with Azure Services
Right, so you’ve got your AKS cluster humming along. You’ve probably set up the Horizontal Pod Autoscaler (HPA) to scale based on CPU or memory, and you’re feeling pretty good about yourself. And you should. But let’s be honest: most of the interesting stuff that happens in the cloud isn’t a slow, steady trickle of CPU load. It’s a sudden, screaming torrent of events. A million messages piling up in a Service Bus queue. A thousand new blobs dropped in a Storage account. A massive backlog in Azure Event Hubs. Your CPU-scaled pods are just sitting there, blissfully unaware of the incoming tidal wave, because a queue backlog doesn’t show up as CPU load until it’s already too late. This is where KEDA (Kubernetes Event-driven Autoscaling) comes in to save the day. It’s the nervous system that connects your pod scale to the actual work that needs to be done.
Think of KEDA not as a replacement for HPA, but as its much smarter, event-aware cousin. KEDA sits on top of HPA, watches your chosen event source (we call these “scalers”), and then drives the standard HPA for you. It translates the language of events (“Holy crap, there are 5000 messages in this queue!”) into the language HPA understands (“Scale the deployment to 10 replicas”). The one thing HPA can’t do on its own is go to and from zero replicas, so KEDA handles that hop itself. It’s brilliant because it leverages a battle-tested Kubernetes primitive (HPA) and just gives it superpowers.
Installing KEDA with Helm (The Right Way)
You can install KEDA with a simple kubectl apply, but don’t. Use Helm. It manages the lifecycle, the CRDs, and the configuration for you, which you’ll thank me for later when you need to upgrade or tweak something. First, add the repo.
helm repo add kedacore https://kedacore.github.io/charts
helm repo update
Now, install it. I like putting it in its own dedicated namespace to keep things tidy. The keda namespace is the standard, so let’s stick with that.
# Create the namespace
kubectl create namespace keda
# Install KEDA
helm install keda kedacore/keda --namespace keda
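Run kubectl get pods -n keda and you should see the KEDA operator and the metrics API server. (Recent chart versions also run an admission webhooks pod, so don’t panic if you count three.) The operator and the metrics server are the dynamic duo that make all this magic happen.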
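If you’d rather script the check than eyeball it, something along these lines works as a rough sketch. The function name keda_verify_install is made up for illustration. The CRD and APIService names are the ones KEDA 2.x registers, so adjust if your version differs.

```shell
# Hypothetical helper: sanity-check a KEDA install in the given namespace.
keda_verify_install() {
  ns="${1:-keda}"
  # The operator and metrics API server pods should be listed (and Running).
  kubectl get pods --namespace "$ns" || return 1
  # The CRDs that ScaledObject and TriggerAuthentication depend on.
  kubectl get crd scaledobjects.keda.sh triggerauthentications.keda.sh || return 1
  # KEDA serves external metrics to the HPA through this APIService.
  kubectl get apiservice v1beta1.external.metrics.k8s.io || return 1
}
```

Call it as keda_verify_install keda. If any of the three lookups fails, the function returns non-zero, which makes it easy to drop into a CI step.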
The Core Concept: The ScaledObject
KEDA’s magic is configured through a custom resource called a ScaledObject. This is where you draw the line from your event source to your Kubernetes Deployment (or StatefulSet, or any custom resource that exposes a /scale subresource). Jobs get their own sibling resource, ScaledJob. You don’t point a ScaledObject at a Pod directly; you point it at a higher-level controller. Here’s the anatomy of a classic example: scaling a worker deployment based on an Azure Storage Queue.
# scaledobject-queue.yaml
apiVersion: keda.sh/v1alpha1
kind: ScaledObject
metadata:
  name: order-processor-scaler
  namespace: default
spec:
  scaleTargetRef:
    name: order-processor # Must match your Deployment's name (same namespace as this object)
  pollingInterval: 30 # How often to check the queue (in seconds). 30 is sane.
  cooldownPeriod: 300 # Seconds to wait after the last active trigger before scaling back to zero. Prevents thrashing.
  minReplicaCount: 0 # The best part! Scale to zero when there's no work.
  maxReplicaCount: 10 # Don't let it go completely bananas and bankrupt you.
  triggers:
    - type: azure-queue
      metadata:
        accountName: myawesomeappstorage # The storage account name
        queueName: ordersqueue # The specific queue to watch
        queueLength: "5" # The magic number: target count of messages per pod replica
        # Credentials can come from a TriggerAuthentication (preferred) or,
        # as here, from an environment variable on the target container.
        connectionFromEnv: AZURE_STORAGE_CONNECTION_STRING # Name of an env var on the Deployment's container
Let’s talk about that queueLength: "5" because this is where people mess up. This doesn’t mean “scale up when the queue length is more than 5.” It means “I want each pod replica to handle, on average, about 5 messages.” So if there are 50 messages in the queue, KEDA will tell HPA to scale to 10 replicas (50 messages / 5 per replica). You need to tune this number based on how long it takes your worker to process a single message. A long-running process? Set this lower (e.g., 1 or 2). A super-fast processor? Crank it up.
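To make the arithmetic concrete, here’s a small sketch of the calculation: ceiling division, clamped between minReplicaCount and maxReplicaCount. It’s a simplified model, not KEDA’s or HPA’s actual code. The real HPA also applies a tolerance and stabilization windows, so treat the numbers as the target it converges on. The function name desired_replicas is invented for this example.

```shell
# Simplified model of the replica target for a queue trigger:
# ceil(queue_len / target_per_replica), clamped to [min, max].
desired_replicas() {
  queue_len=$1; target=$2; min=$3; max=$4
  n=$(( (queue_len + target - 1) / target ))   # integer ceiling division
  if [ "$n" -lt "$min" ]; then n=$min; fi
  if [ "$n" -gt "$max" ]; then n=$max; fi
  echo "$n"
}

desired_replicas 50 5 0 10    # prints 10
desired_replicas 7 5 0 10     # prints 2
desired_replicas 500 5 0 10   # prints 10 (capped by maxReplicaCount)
desired_replicas 0 5 0 10     # prints 0 (scaled to zero)
```

Notice the third case: a 500-message backlog wants 100 replicas, but maxReplicaCount holds it at 10, so the queue drains more slowly instead of your bill exploding.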
Authentication: Don’t Be an Idiot
You’ll notice I referenced connectionFromEnv in the YAML. You absolutely, positively, must not put your connection string directly in the ScaledObject. That’s a one-way ticket to credential leakage. Instead, you use a TriggerAuthentication or a Secret referenced by the deployment. Here’s the secure way to do it:
First, create a Kubernetes Secret with your Azure Storage connection string. Get the connection string from the Azure portal under “Access Keys” for your storage account.
kubectl create secret generic azure-secrets --from-literal=AZURE_STORAGE_CONNECTION_STRING="DefaultEndpointsProtocol=..."
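If copy-pasting out of the portal feels error-prone, you can pull the connection string with the Azure CLI and pipe it straight into the secret. This is a sketch under a few assumptions: you’re already logged in with az login, and the function name, resource group, and namespace arguments are placeholders I made up for illustration.

```shell
# Hypothetical helper: fetch the storage connection string with the Azure CLI
# and store it in the azure-secrets Kubernetes Secret.
create_storage_secret() {
  rg="$1"; account="$2"; ns="${3:-default}"
  # Read the connection string without ever printing it to the terminal.
  conn="$(az storage account show-connection-string \
    --resource-group "$rg" --name "$account" \
    --query connectionString --output tsv)" || return 1
  # The Secret must live in the same namespace as the ScaledObject.
  kubectl create secret generic azure-secrets \
    --namespace "$ns" \
    --from-literal=AZURE_STORAGE_CONNECTION_STRING="$conn"
}
```

You’d call it as create_storage_secret my-resource-group myawesomeappstorage default, where my-resource-group stands in for whatever resource group actually holds the storage account.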
Then, your ScaledObject points to a TriggerAuthentication, which in turn points to that secret. All three need to live in the same namespace.
# scaledobject-queue-secure.yaml
apiVersion: keda.sh/v1alpha1
kind: ScaledObject
metadata:
  name: order-processor-scaler
spec:
  scaleTargetRef:
    name: order-processor
  # ... other specs ...
  triggers:
    - type: azure-queue
      metadata:
        accountName: myawesomeappstorage
        queueName: ordersqueue
        queueLength: "5"
      authenticationRef:
        name: azure-trigger-auth # Reference the TriggerAuthentication object
---
apiVersion: keda.sh/v1alpha1
kind: TriggerAuthentication
metadata:
  name: azure-trigger-auth
spec:
  secretTargetRef:
    - parameter: connection
      name: azure-secrets # The name of the secret we created
      key: AZURE_STORAGE_CONNECTION_STRING # The key within that secret
This separates concerns beautifully: the credentials stay out of your ScaledObject manifest, and several ScaledObjects in the same namespace can share one TriggerAuthentication.
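If you want to ditch connection strings altogether, a TriggerAuthentication can use Workload Identity instead of a secret. The sketch below writes such a manifest to a file. It assumes the KEDA operator was installed with Workload Identity enabled and that a federated identity already exists, neither of which is covered here. The object name azure-trigger-auth-wi is a placeholder.

```shell
# Write a TriggerAuthentication that authenticates via Workload Identity.
cat > triggerauth-workload-identity.yaml <<'EOF'
apiVersion: keda.sh/v1alpha1
kind: TriggerAuthentication
metadata:
  name: azure-trigger-auth-wi   # hypothetical name
  namespace: default
spec:
  podIdentity:
    provider: azure-workload    # use Workload Identity instead of a secret
EOF

# Apply it once you have reviewed the file:
# kubectl apply -f triggerauth-workload-identity.yaml
```

To use it, point the trigger’s authenticationRef at azure-trigger-auth-wi and keep accountName in the trigger metadata so KEDA knows which storage account to query. The identity still needs a data-plane role such as “Storage Queue Data Reader” on the queue.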
Pitfalls and “Oh Crap” Moments
- The Cold Start: Scaling from zero isn’t free. When the first message hits the queue, KEDA has to scale out the deployment (which might involve pulling the container image if the node is new) and then the pod has to start. This adds latency. If you can’t tolerate that initial delay, set minReplicaCount: 1 to keep a warm pod ready.
- Permission Denied: This is the number one issue. Your KEDA operator needs permissions to read the event source. If you’re using Workload Identity (which you should be; the older Azure AD Pod Identity is deprecated), make sure the identity you’re using has the correct RBAC roles on the Azure resource (e.g., “Storage Queue Data Reader” on the queue).
- Logging is Your Friend: When things go sideways, check the KEDA operator logs (kubectl logs -n keda deploy/keda-operator). It will loudly complain about bad connection strings, missing permissions, or misconfigured queue names. It’s very talkative when it’s unhappy, which is exactly what you want.
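When I’m chasing one of these, I tend to run the same handful of commands every time, so here they are wrapped in a function. Treat it as a rough sketch: keda_debug is a name I made up, and the keda-hpa- prefix is KEDA’s default HPA naming, which won’t match if you’ve overridden the HPA name in the ScaledObject.

```shell
# Hypothetical helper: gather the usual suspects for a misbehaving ScaledObject.
keda_debug() {
  so="$1"; ns="${2:-default}"
  # READY and ACTIVE columns tell you whether KEDA can reach the event source.
  kubectl get scaledobject "$so" --namespace "$ns"
  # Events and conditions, including authentication failures.
  kubectl describe scaledobject "$so" --namespace "$ns"
  # The HPA that KEDA manages on your behalf (default name: keda-hpa-<scaledobject>).
  kubectl get hpa "keda-hpa-$so" --namespace "$ns"
  # The operator's most recent complaints.
  kubectl logs --namespace keda deploy/keda-operator --tail=50
}
```

For the example in this section, that’s keda_debug order-processor-scaler default.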
KEDA turns your AKS cluster from a static resource into a truly dynamic, event-driven beast that only consumes resources when there’s actual work to be done. It’s one of those tools that, once you use it, you’ll wonder how you ever lived without it. Now go configure it, mess it up, read the logs, fix it, and then watch in awe as your cluster breathes in and out with the rhythm of your application.