12.4 LoadBalancer: Cloud Provider Integration

Right, so you’ve arrived at the LoadBalancer. This is where Kubernetes gets off its high horse of elegant abstraction and says, “Fine, you want to talk to the real world? Let’s call your cloud provider.” A LoadBalancer service is essentially a NodePort service with a superpower: it knows how to automatically phone home to your cloud platform (AWS, GCP, Azure, etc.) and ask it to provision a brand-spanking-new external load balancer pointed right at your service.

This is the “just make it work” button. You define it, and a few moments later, you get a public IP or DNS name that distributes traffic to all the healthy pods in your service. It’s magic. Expensive magic, but magic nonetheless.

The Cloud Provider Machinery

Here’s the thing the Kubernetes project doesn’t shout about: Kubernetes itself doesn’t know how to create load balancers. Zero. Zilch. It outsources that job to a separate component, the Cloud Controller Manager (CCM), which runs alongside your control plane. The CCM is supplied by your cloud provider. It’s their code, watching the Kubernetes API for Service objects with type: LoadBalancer. When it sees one, it springs into action.

It translates your Kubernetes Service manifest into native API calls to its own platform. In AWS, it creates a Network Load Balancer (NLB) or a Classic Load Balancer (CLB). In GCP, it creates a Network Load Balancer. In Azure, it’s an Azure Load Balancer. This is why the LoadBalancer type is fundamentally a cloud-locked feature. Try this on your local minikube or kind cluster, and it will “work” in the sense that the Service gets created, but its EXTERNAL-IP will sit at <pending> forever because there’s no cloud provider to answer the call. Those tools often have their own workarounds, like minikube tunnel, which is basically a clever hack to mimic this behavior.

A Basic, Boring, and Effective Example

Let’s define one. It looks almost identical to a NodePort service, but the type changes.

apiVersion: v1
kind: Service
metadata:
  name: my-awesome-app-loadbalancer
spec:
  selector:
    app: my-awesome-app
  ports:
    - protocol: TCP
      port: 80          # The port the Load Balancer listens on
      targetPort: 8080  # The port the pods are listening on
  type: LoadBalancer

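Apply this manifest to a cluster running in AWS, and after roughly 30-120 seconds (cloud providers are not known for their haste), the kubectl get svc output will show a value under EXTERNAL-IP. On AWS that value is a DNS hostname rather than an IP address; on providers that hand out addresses directly, it’s an IP. Either way, that’s your public endpoint.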

$ kubectl get svc my-awesome-app-loadbalancer
NAME                             TYPE           CLUSTER-IP       EXTERNAL-IP                                                              PORT(S)        AGE
my-awesome-app-loadbalancer      LoadBalancer   10.100.XXX.XXX   a01234567890abcdef1234567890abcd-1234567890.us-west-2.elb.amazonaws.com   80:32456/TCP   2m
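
The EXTERNAL-IP column isn’t computed by kubectl; it is read from the Service’s status, which the cloud controller fills in once provisioning finishes. As a rough sketch, this is the relevant fragment you might see from kubectl get svc my-awesome-app-loadbalancer -o yaml on AWS, reusing the placeholder hostname from the output above. On providers that hand out addresses rather than DNS names, expect an ip field in place of hostname.

```yaml
status:
  loadBalancer:
    ingress:
      # Written by the cloud controller, not by you.
      # This list is empty while EXTERNAL-IP still shows <pending>.
      - hostname: a01234567890abcdef1234567890abcd-1234567890.us-west-2.elb.amazonaws.com
```

If you are scripting against a new Service, this is the field to poll instead of scraping the table output.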

The Devil in the Details: Annotations and Spec

This is where the designers said, “Every cloud is different, so instead of a clean API, let’s use annotations.” It’s a pragmatic, if ugly, choice. You use provider-specific annotations to configure the heck out of the underlying load balancer.

Want an AWS Network Load Balancer (NLB) instead of the classic? Want to set idle connection timeouts? SSL certificates? Internal vs. external facing? It’s all in the annotations.

apiVersion: v1
kind: Service
metadata:
  name: my-nlb-service
  annotations:
    service.beta.kubernetes.io/aws-load-balancer-type: "nlb" # Specific to AWS!
    service.beta.kubernetes.io/aws-load-balancer-connection-idle-timeout: "60" # Also AWS; a Classic Load Balancer setting, so don't count on an NLB honoring it
spec:
  selector:
    app: my-app
  ports:
    - protocol: TCP
      port: 443
      targetPort: 8443
  type: LoadBalancer
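
To make the “every cloud is different” point concrete, here is a hedged sketch of the same request, an internal-only load balancer, spelled three different ways. The Service name and app label are hypothetical. The annotation keys are the commonly documented ones for each provider, but they vary with the controller and version you run, so treat them as a starting point and check your provider’s docs. Annotations a provider doesn’t recognize are simply ignored, which is why all three can sit side by side here.

```yaml
apiVersion: v1
kind: Service
metadata:
  name: my-internal-service   # hypothetical name
  annotations:
    # Keep only the line that matches your provider in real manifests.
    service.beta.kubernetes.io/aws-load-balancer-internal: "true"    # AWS (legacy in-tree controller)
    service.beta.kubernetes.io/azure-load-balancer-internal: "true"  # Azure
    networking.gke.io/load-balancer-type: "Internal"                 # GKE
spec:
  type: LoadBalancer
  selector:
    app: my-app
  ports:
    - protocol: TCP
      port: 443
      targetPort: 8443
```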

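The loadBalancerIP field in the spec is a cruel joke on some platforms (like AWS), where the address is managed entirely by the cloud provider and you can’t request a specific one here. On others (like GCP and Azure), you can sometimes specify a pre-reserved static IP. The field has also been deprecated since Kubernetes 1.24 precisely because its meaning varies between implementations, so treat it as legacy and expect providers to steer you toward their own annotations. You must consult your cloud provider’s documentation, which is the unofficial addendum to the Kubernetes docs.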
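
For the platforms that do honor the field, a minimal sketch looks like the following. The Service name is hypothetical, and 203.0.113.10 is a documentation-range address standing in for one you have already reserved with your provider. Whether your provider’s controller accepts this field, or prefers an annotation instead, is something to confirm in its docs.

```yaml
apiVersion: v1
kind: Service
metadata:
  name: my-static-ip-service   # hypothetical name
spec:
  type: LoadBalancer
  # Assumption: this address was reserved in the cloud account beforehand.
  # 203.0.113.10 is a placeholder from the documentation range, not a real IP.
  loadBalancerIP: 203.0.113.10
  selector:
    app: my-app
  ports:
    - protocol: TCP
      port: 80
      targetPort: 8080
```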

The Cost of Convenience (The Big Gotcha)

Here’s the rough edge I need you to understand: every LoadBalancer service typically provisions a new cloud load balancer instance. Those things are expensive. They are not just IP addresses; they are full-blown, highly available infrastructure components with hourly costs and data processing charges.

If you naively create one LoadBalancer Service per application, your cloud bill will look like a telephone number from the 1980s. The best practice, the one we use in the trenches, is to put an Ingress controller in front. You provision one LoadBalancer service for your ingress controller (like NGINX or Traefik). Then, you define Ingress resources for your individual applications. Traffic for all of them enters through that single, expensive load balancer, and the controller routes each request to the right application by hostname and path, saving you a small fortune. The LoadBalancer service is your gateway to the world, but you shouldn’t need a new gateway for every house in your neighborhood.
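
A minimal sketch of that pattern, assuming an NGINX-based ingress controller is already installed and registered under the ingress class name nginx. The hostnames and backend Service names are hypothetical. Both applications sit behind plain ClusterIP Services and share the controller’s single load balancer.

```yaml
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: shared-entrypoint        # hypothetical name
spec:
  ingressClassName: nginx        # assumption: matches your installed controller's class
  rules:
    - host: shop.example.com     # hypothetical hostname
      http:
        paths:
          - path: /
            pathType: Prefix
            backend:
              service:
                name: shop       # ordinary ClusterIP Service, no LoadBalancer of its own
                port:
                  number: 80
    - host: blog.example.com     # hypothetical hostname
      http:
        paths:
          - path: /
            pathType: Prefix
            backend:
              service:
                name: blog       # second app, same shared load balancer
                port:
                  number: 80
```

Adding a third application means adding another rule here, not another line item on the cloud bill.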