26.4 ECS Services: Desired Count, Load Balancer Integration, and Service Discovery

Right, so you’ve got your task definition. It’s the blueprint. Now we need to actually run the thing, keep it alive, and let the world talk to it. That’s the job of the ECS Service. Think of it as the hyper-competent foreman on a construction site who doesn’t just build one house from your blueprint, but makes sure exactly N houses are always standing, even if termites (read: crashing containers) take one out.

The desired count is the beating heart of the service. Set it to 2, and ECS will fight like hell to keep two instances of your task running at all times. It’s a simple number, but it’s the source of most of your self-healing magic. A task dies? ECS sees the count is 1, so it immediately spins up a new one to get back to 2. It’s beautifully, stupidly simple. But this is also where people get into trouble. Set it to 1 for a production service? You’re just asking for downtime during a deployment or a random failure. Set it to 50 and forget to configure scaling? You’re about to get a very expensive bill and a lesson in attentiveness.
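If the count should follow load rather than stay fixed, Application Auto Scaling can manage it for you. Here is a minimal sketch, assuming the cluster and service are the MyCluster and MyWebService resources from the snippet later in this section. The resource names MyScalableTarget and MyCpuScalingPolicy are made up for illustration, and the capacity limits and 60% CPU target are placeholders, not recommendations.

```yaml
MyScalableTarget:
  Type: AWS::ApplicationAutoScaling::ScalableTarget
  Properties:
    ServiceNamespace: ecs
    ScalableDimension: ecs:service:DesiredCount
    # Format: service/<cluster-name>/<service-name>
    ResourceId: !Sub service/${MyCluster}/${MyWebService.Name}
    MinCapacity: 2    # floor: never fewer than two tasks
    MaxCapacity: 10   # ceiling: caps the bill if load spikes

MyCpuScalingPolicy:
  Type: AWS::ApplicationAutoScaling::ScalingPolicy
  Properties:
    PolicyName: cpu-target-tracking
    PolicyType: TargetTrackingScaling
    ScalingTargetId: !Ref MyScalableTarget
    TargetTrackingScalingPolicyConfiguration:
      PredefinedMetricSpecification:
        PredefinedMetricType: ECSServiceAverageCPUUtilization
      TargetValue: 60.0   # illustrative target, tune for your workload
```

Once a policy like this is in place, the scaler owns the running count. The DesiredCount in the service template is then best treated as a starting value, since re-applying it on a stack update can fight with whatever the scaler has chosen.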

Integrating with a Load Balancer

You don’t want users hitting individual, ephemeral containers directly. That’s what the load balancer is for. You point your DNS (my-app.example.com) to the ALB/NLB, and the ECS service makes sure the ALB knows about all the healthy containers it’s managing. The integration is almost comically elegant when it works.

You define this in the service’s loadBalancers section. The key part is the containerName and containerPort mapping. This is AWS’s way of asking, “Hey, which of the containers in your task (you are running multiple in a single task, right? …right?) and which port on that container should the LB send traffic to?” Name a container that isn’t in the task definition and the service won’t even create. Point at the wrong port and it gets worse: the LB health checks fail, ECS kills the “unhealthy” tasks and starts replacements, and you end up with a service that restarts forever and is utterly unreachable. It’s the cloud equivalent of locking yourself out of your own car.

Here’s what a CloudFormation snippet for a service with an ALB attachment looks like:

MyWebService:
  Type: AWS::ECS::Service
  Properties:
    Cluster: !Ref MyCluster
    TaskDefinition: !Ref MyTaskDefinition
    DesiredCount: 2
    LaunchType: FARGATE
    NetworkConfiguration:
      AwsvpcConfiguration:
        AssignPublicIp: ENABLED
        SecurityGroups:
          - !Ref MyContainerSecurityGroup
        Subnets:
          - subnet-12345abc
          - subnet-67890def
    LoadBalancers:
      - TargetGroupArn: !Ref MyTargetGroup
        ContainerName: "my-web-app" # Must match a container name in the task definition
        ContainerPort: 8080 # Must match a containerPort in that container's portMappings

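The magic here is that when a task starts, the ECS service scheduler (not anything running inside your container) registers it with the load balancer’s target group: “Hey, new IP here, on this port, start sending it traffic.” When the task stops, ECS deregisters it, and the load balancer drains in-flight connections before letting go. Because Fargate tasks use awsvpc networking and are registered by IP address, the target group must use the ip target type. It’s dynamic service registration without you having to run a single script.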
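For completeness, here is a sketch of what the MyTargetGroup referenced above might look like. MyVpc and the /healthz path are assumptions made for illustration, and the timing values are placeholders rather than tuned settings.

```yaml
MyTargetGroup:
  Type: AWS::ElasticLoadBalancingV2::TargetGroup
  Properties:
    VpcId: !Ref MyVpc             # hypothetical VPC resource
    Protocol: HTTP
    Port: 8080
    TargetType: ip                # tasks in awsvpc mode (Fargate) register by IP
    HealthCheckPath: /healthz     # hypothetical health endpoint in the app
    HealthCheckIntervalSeconds: 15
    HealthyThresholdCount: 2
    TargetGroupAttributes:
      # How long the load balancer drains in-flight requests
      # after a task is deregistered
      - Key: deregistration_delay.timeout_seconds
        Value: "30"
```

One CloudFormation wrinkle to watch for: the target group has to be attached to a listener before the service is created, so the service usually needs a DependsOn pointing at the listener resource.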

Service Discovery: For When You’re Fancy

Sometimes you don’t need a public-facing load balancer. Maybe your backend API service just needs to talk to your authentication service. For this, you use ECS Service Discovery, which is built on AWS Cloud Map and is essentially managed, DNS-based service registration for your VPC. It is not a load balancer: it hands out task IPs and leaves the choice to the client. It’s fantastic for internal microservices communication.

You enable it, and ECS will automatically register your tasks under a private DNS namespace you choose (e.g., cluster.local), giving each service a friendly DNS name like myservice.cluster.local. Other tasks in your VPC can just resolve that name and get back the IPs of healthy tasks. With multivalue routing, Route 53 returns up to eight of them per query, so traffic spreads out because clients pick from the list, not because anything is balancing it for you. It’s shockingly easy to set up for what it does.

MyInternalService:
  Type: AWS::ECS::Service
  Properties:
    # ... other standard service properties ...
    ServiceRegistries:
      - RegistryArn: !GetAtt MyServiceDiscoveryService.Arn

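The pitfall? Cost and DNS caching. The private namespace is backed by a Route 53 hosted zone, and you pay for that plus the queries against it. The charges are small, but check the current Cloud Map and Route 53 pricing before you roll this out across hundreds of services. The TTL on the DNS records is not fixed: you set it on the Cloud Map service’s DNS configuration. Lowering it helps, but any client or resolver that caches names for longer than the TTL will keep hitting stale IPs. It’s fine for most use cases, but if you need sub-second service registration/deregistration, you might still need something like Consul.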
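The MyServiceDiscoveryService referenced in the snippet above has to be defined somewhere. Here is a minimal sketch of it and its namespace, assuming a hypothetical MyVpc resource and a namespace called cluster.local; the 10-second TTL is an illustrative value.

```yaml
MyPrivateNamespace:
  Type: AWS::ServiceDiscovery::PrivateDnsNamespace
  Properties:
    Name: cluster.local           # hypothetical namespace name
    Vpc: !Ref MyVpc               # hypothetical VPC resource

MyServiceDiscoveryService:
  Type: AWS::ServiceDiscovery::Service
  Properties:
    Name: myservice               # resolves as myservice.cluster.local
    NamespaceId: !Ref MyPrivateNamespace
    DnsConfig:
      RoutingPolicy: MULTIVALUE   # return several healthy IPs per query
      DnsRecords:
        - Type: A
          TTL: 10                 # seconds; illustrative value
    HealthCheckCustomConfig:
      FailureThreshold: 1         # ECS reports task health to Cloud Map
```

A short TTL means clients notice replaced tasks sooner, at the price of more DNS queries.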

The Deployment Gotcha: Your Desired Count is King

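Here’s the bit that trips up everyone the first time. You have a service with a desired count of 2. You update the service to use a new task definition. What happens? ECS does a rolling deployment. With the default deployment settings (minimumHealthyPercent of 100, maximumPercent of 200), it starts new tasks alongside the old ones, waits for them to be healthy, then drains and stops the old ones. Lower the minimum healthy percent and it is allowed to stop old tasks first to make room. Perfect.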

Now, what if you need to quickly roll back? You change the service’s task definition back to the previous revision. ECS will… do another rolling deployment, replacing whatever is running with the old version. It feels like churn, but it is the right move. The tempting shortcut is to set the desired count to 0, wait for the tasks to stop, then set it back to 2. Don’t. Scaling to zero takes the service completely offline, and when you scale back up, the service launches whatever task definition it currently points to, which is the new, broken one unless you have already changed it back. If the new tasks never became healthy in the first place, the old ones are still serving traffic under the default settings, so the rollback costs you very little. Remember: the service’s job is to meet the desired count with the defined task. To change how many run, change the count; to change what runs, change the task definition.
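ECS can also handle the abort for you. The sketch below sets the rolling deployment bounds explicitly and turns on the deployment circuit breaker with automatic rollback. The percentages shown are the defaults for a rolling update, and the 60-second grace period is an illustrative value.

```yaml
MyWebService:
  Type: AWS::ECS::Service
  Properties:
    # ... other standard service properties ...
    DeploymentConfiguration:
      MinimumHealthyPercent: 100   # never drop below DesiredCount during a deploy
      MaximumPercent: 200          # allow up to 2x DesiredCount while new tasks start
      DeploymentCircuitBreaker:
        Enable: true               # stop a deployment whose tasks keep failing
        Rollback: true             # and return to the last completed deployment
    # Ignore load balancer health check failures while a new task warms up
    HealthCheckGracePeriodSeconds: 60
```

With a desired count of 2, these bounds mean ECS may run up to four tasks during a deployment and will not go below two healthy ones.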