22.5 NAT Gateway: Outbound Internet for Private Subnets

Right, so you’ve built this pristine private subnet. Your application servers are tucked safely away, shielded from the random drive-by scans of the internet. It’s a fortress. But then you realize your little fortress-dwellers are getting a bit stir-crazy. They need to phone home, download security patches, call an API, or maybe just check if there’s a new cat video on YouTube. They need outbound internet access.

This is where the NAT Gateway comes in. It’s the single, controlled, heavily fortified exit door for your private subnet. Think of it as the drawbridge. Your instances can send traffic out, but the internet can’t initiate a conversation back in. It’s a one-way street, and it’s brilliant for security.

Why a NAT Gateway Isn’t Just a Fancy Router

You might be thinking, “I have an Internet Gateway for my public subnet. Isn’t that enough?” Nope. An Internet Gateway is a two-way door. It lets inbound traffic reach any instance that has a public IP, as long as the Security Group and NACL permit it. For a private subnet, we want zero inbound initiation from the internet. A NAT Gateway is a different beast entirely; it’s a managed network address translation service. AWS handles the scaling, the redundancy within its Availability Zone, and the messy TCP connection tracking for you. You just create it, point your route to it, and forget it. It’s one of those rare AWS services that “just works” and is worth every penny.

The magic is in the routing. Your private subnet’s route table doesn’t have a route to 0.0.0.0/0 through the Internet Gateway (igw-). That would make it public. Instead, it points that traffic to the NAT Gateway (nat-).

Where to Put This Thing (The Only Tricky Part)

This is the part everyone messes up, so pay attention. A NAT Gateway is not an instance you launch. It’s a logical resource you place… in a specific subnet. And not just any subnet. It must be placed in a public subnet. Why? Because the NAT Gateway itself needs a path to the internet to do its job. It uses the public subnet’s route table, which sends 0.0.0.0/0 traffic to an Internet Gateway.

Let’s visualize this. Your architecture should look like this:

Public Subnet A: Contains your NAT Gateway. Its route table points 0.0.0.0/0 -> igw-12345.
Private Subnet B: Contains your application instances. Its route table points 0.0.0.0/0 -> nat-abcde.

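The NAT Gateway is the bridge between these two worlds. Your private instance makes a request, the route table sends it to the NAT Gateway in the public subnet, which swaps the instance’s source IP for its own and sends the traffic out through the Internet Gateway, where it appears to the world as the NAT Gateway’s public (Elastic) IP. Replies on that connection are translated back and delivered to the instance.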

Let’s Build One: The Boring but Necessary Code

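First, you need that public subnet. I’ll assume you have a VPC and have already carved it up. If your public subnet sets map_public_ip_on_launch = true (MapPublicIpOnLaunch in the API), note that the setting applies to instances launched in that subnet, not to the NAT Gateway. A public NAT Gateway does not get an auto-assigned address; it uses an Elastic IP that you allocate and attach, which is what the aws_eip resource below is for.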

# variables.tf
variable "vpc_id" { type = string }
variable "public_subnet_id" { type = string }       # Your existing public subnet
variable "private_route_table_id" { type = string } # The route table for your private subnet

# nat.tf
resource "aws_eip" "nat" {
  domain = "vpc"
  tags = {
    Name = "my-nat-eip"
  }
}

resource "aws_nat_gateway" "main" {
  allocation_id = aws_eip.nat.id
  subnet_id     = var.public_subnet_id  # CRITICAL: This must be a public subnet

  tags = {
    Name = "my-main-nat-gw"
  }

  # The NAT Gateway needs the VPC's Internet Gateway to exist first. Nothing in
  # this resource references the IGW, so Terraform cannot infer the ordering.
  # Declare it explicitly.
  depends_on = [aws_internet_gateway.example] # Replace with your own IGW resource
}

# This is the most important part: the route rule for the PRIVATE subnet
resource "aws_route" "private_internet_access" {
  route_table_id         = var.private_route_table_id
  destination_cidr_block = "0.0.0.0/0"
  nat_gateway_id         = aws_nat_gateway.main.id
}
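
For completeness, here is roughly what the public side assumed by the code above might look like. Treat it as a minimal sketch rather than a drop-in: the CIDR block, the Availability Zone, and the resource names public_a, public, and public_default are placeholders of mine, and aws_internet_gateway.example is named only to match the depends_on reference above.

```hcl
# Hypothetical public side of the VPC. Names, CIDR, and AZ are placeholders.
resource "aws_internet_gateway" "example" {
  vpc_id = var.vpc_id
}

resource "aws_subnet" "public_a" {
  vpc_id                  = var.vpc_id
  cidr_block              = "10.0.0.0/24" # assumed; must fall inside your VPC's CIDR
  availability_zone       = "us-east-1a"  # assumed
  map_public_ip_on_launch = true          # affects instances launched here, not the NAT Gateway
}

resource "aws_route_table" "public" {
  vpc_id = var.vpc_id
}

# The default route to the IGW is what makes this subnet "public".
resource "aws_route" "public_default" {
  route_table_id         = aws_route_table.public.id
  destination_cidr_block = "0.0.0.0/0"
  gateway_id             = aws_internet_gateway.example.id
}

resource "aws_route_table_association" "public_a" {
  subnet_id      = aws_subnet.public_a.id
  route_table_id = aws_route_table.public.id
}
```

If you manage the subnet this way, you would pass aws_subnet.public_a.id to the NAT Gateway instead of var.public_subnet_id.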

The Gotchas and Gray Areas

AZ Affinity: This is a big one. A NAT Gateway is redundantly built, but only within a single Availability Zone. If that AZ goes down, so does the gateway, and every private subnet routing through it loses internet access. So you create one per AZ for resilience. If you have private subnets in us-east-1a and us-east-1b, you want a NAT Gateway in a public subnet in each of those AZs, and the private subnet in 1a should route to the NAT Gateway in 1a. Sending traffic to a NAT Gateway in another AZ works, but it incurs cross-AZ data transfer charges, adds latency, and ties one AZ’s connectivity to another AZ’s health. Don’t do it. Design for one NAT Gateway per AZ.
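Bandwidth Limits: AWS won’t let you starve your neighbors. Each NAT Gateway starts at 5 Gbps and scales automatically up to 100 Gbps, and it processes from one million up to 10 million packets per second. The limit you are more likely to hit is connections: up to 55,000 simultaneous connections to each unique destination per associated IP address. You’ll almost certainly be fine, but it’s good to know the hard limits exist, especially if many instances hammer the same endpoint.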
The Cost of Forgetting: NAT Gateways aren’t free. There’s an hourly charge and a per-GB data processing charge. The most common cost oopsie is leaving one running in a dev environment you’re not using. A NAT Gateway can’t be stopped, only deleted, so either tear it down with your other infra or script its deletion and re-creation on a schedule. Release the Elastic IP too if you don’t plan to reuse it.
No Security Groups: You can’t attach security groups to a NAT Gateway. Traffic to and from the gateway itself is filtered only by the Network ACL of the public subnet it sits in. Your outbound traffic control is still, and always, governed by the security groups on your private instances (and the NACLs on their subnets). The NAT Gateway just faithfully passes along whatever traffic your instances are allowed to send.
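
The AZ Affinity point above can be sketched in Terraform with for_each. This is one possible layout, not the only one: the variables public_subnet_ids and private_route_table_ids are hypothetical and are not part of the earlier code. Both are assumed to be maps keyed by AZ name, with the same keys in each, so every private route table is paired with the NAT Gateway in its own AZ.

```hcl
# Hypothetical inputs, keyed by AZ name (for example "us-east-1a").
variable "public_subnet_ids" {
  type = map(string) # AZ name => ID of a public subnet in that AZ
}

variable "private_route_table_ids" {
  type = map(string) # AZ name => ID of the private route table for that AZ
}

# One Elastic IP per AZ.
resource "aws_eip" "nat_per_az" {
  for_each = var.public_subnet_ids
  domain   = "vpc"
}

# One NAT Gateway per AZ, each placed in that AZ's public subnet.
resource "aws_nat_gateway" "per_az" {
  for_each      = var.public_subnet_ids
  allocation_id = aws_eip.nat_per_az[each.key].id
  subnet_id     = each.value
}

# Each private route table sends 0.0.0.0/0 to the NAT Gateway in the same AZ.
resource "aws_route" "private_default_per_az" {
  for_each               = var.private_route_table_ids
  route_table_id         = each.value
  destination_cidr_block = "0.0.0.0/0"
  nat_gateway_id         = aws_nat_gateway.per_az[each.key].id
}
```

Keying everything by AZ name means a mismatch fails loudly: if a private route table is listed for an AZ with no public subnet, Terraform errors on the missing key at plan time instead of quietly routing across AZs.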