Alright, let’s get into the weeds on NACLs. If Security Groups are your bouncer, checking IDs at the door of your instance, then NACLs are the building’s security gate. They’re stateless, they work at the subnet level, and they have a set of numbered rules that they evaluate in order. This is where things get both powerful and, frankly, a bit silly if you’re not careful.

The single most important concept to burn into your brain is this: NACLs evaluate their numbered rules in ascending order, from the lowest number to the highest, until they find a match. The first rule that matches the packet is the one that gets applied, full stop. It doesn’t keep looking. This is why you can’t just slap rules in there willy-nilly; order is absolutely everything.

The Implicit Deny: The Bouncer’s “Not On The List”

Here’s the kicker, the thing that trips up everyone at least once: every NACL ends with a final catch-all rule called the implicit deny. It’s only reached when none of your numbered rules has matched, and it says “DENY” for all traffic. The console shows it with an asterisk (*) where the rule number would be (the API reports it as rule 32767), and you can’t edit or delete it. It’s easy to overlook, but it’s always there, lurking, ready to ruin your day if you forget about it.

Think of it this way: your NACL rule set is a list of exceptions to a default policy of “absolutely nothing gets through.” You are explicitly allowing certain traffic in spite of this default deny-all stance. If a packet doesn’t match any of your numbered rules, it will inevitably hit this implicit deny and get unceremoniously dropped. This is the core of the “default deny” security model, and it’s actually a good thing once you learn to work with it.
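The evaluation order and the fall-through deny described above can be sketched in a few lines of Python. This is a toy model, not how AWS implements it: the Rule class and evaluate function are invented for illustration, and rules match on a single port only to keep things short.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Rule:
    number: int   # NACL rule number (1-32766)
    port: int     # single destination port, to keep the sketch small
    action: str   # "ALLOW" or "DENY"


def evaluate(rules, port):
    """Return the action of the lowest-numbered rule that matches `port`."""
    for rule in sorted(rules, key=lambda r: r.number):
        if rule.port == port:
            return rule.action  # first match wins; later rules are never consulted
    return "DENY"               # nothing matched: the implicit deny applies


rules = [
    Rule(200, 443, "ALLOW"),
    Rule(300, 80, "DENY"),    # never reached for port 80: rule 100 matches first
    Rule(100, 80, "ALLOW"),
]

print(evaluate(rules, 80))    # ALLOW
print(evaluate(rules, 443))   # ALLOW
print(evaluate(rules, 22))    # DENY
```

Note that the order in which the rules appear in the list is irrelevant; only the rule numbers decide who gets asked first.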

The Crucial Dance of Rule Numbers

Because rules are evaluated in order, you need to leave space between them. You’ll see why in a second. AWS lets you create rules from 1 to 32766. The smart move is to create rules in increments of 100. I use 100, 200, 300, etc. Why? Because it leaves you 99 empty slots between each rule. Trust me, your future self will thank you when you need to add a new rule between 100 and 200 and don’t have to renumber your entire ACL.
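If you want to see what that gap buys you, here is a small sketch of picking a free slot between two existing rules. The free_rule_number helper is hypothetical (it is not part of any AWS tooling); it simply prefers the midpoint so that room is left on both sides.

```python
def free_rule_number(existing, lower, upper):
    """Pick an unused rule number strictly between `lower` and `upper`.

    Prefers the slot closest to the midpoint so gaps remain on both sides.
    Returns None if every slot in between is already taken.
    """
    taken = set(existing)
    candidates = [n for n in range(lower + 1, upper) if n not in taken]
    if not candidates:
        return None
    midpoint = (lower + upper) // 2
    return min(candidates, key=lambda n: abs(n - midpoint))


print(free_rule_number([100, 200, 300], 100, 200))  # 150
print(free_rule_number([1, 2, 3], 1, 2))            # None
```

With rules numbered 1, 2, 3 there is nowhere to go, which is exactly the renumbering chore the increments of 100 are meant to avoid.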

Let’s look at a classic pitfall. Imagine you want to allow HTTP and HTTPS traffic but block a specific malicious IP.

The WRONG way to do it (a tale of woe):

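# This is a disaster waiting to happen.
# (Listed in the order they were added; the NACL evaluates by number: 100, 150, 200.)
# Rule 100: Allow HTTP from anywhere
# Rule 200: Allow HTTPS from anywhere
# Rule 150: Deny a specific IP address (Oops! Too late!)

See the problem? HTTP traffic from that malicious IP comes in. Rule 100 is checked: “Is this HTTP traffic?” Yes, it is. The NACL goes “Cool, allowed!” and lets the packet through. It never even gets to rule 150. Rule 150 does sit ahead of rule 200, so HTTPS from that IP would be blocked, but for HTTP the order of evaluation made your deny rule completely useless, and a block that only half works is arguably worse than one that obviously fails.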

The RIGHT way to do it:

You need to put your more specific deny rules before your broad allow rules. This is non-negotiable.

# This is the way.
# Rule 100: DENY this specific malicious IP address
# Rule 200: ALLOW HTTP from anywhere
# Rule 300: ALLOW HTTPS from anywhere

Now, the traffic hits rule 100 first. The NACL asks, “Is this from the bad IP?” If yes, it gets denied immediately and never reaches the permissive rules. Perfect.
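As a sketch of how that deny-first rule set might be created with boto3, the snippet below builds the keyword arguments for the EC2 create_network_acl_entry call. The helper names, the ACL ID, and the blocked address are placeholders I made up for illustration; nothing is sent to AWS unless you call apply_entries with a real client.

```python
def build_inbound_entries(acl_id, blocked_cidr):
    """Build keyword arguments for EC2 create_network_acl_entry calls.

    acl_id and blocked_cidr come from the caller; the values used in the
    usage lines below are placeholders, not real resources.
    """
    common = {"NetworkAclId": acl_id, "Egress": False}
    return [
        # Deny everything from the blocked address first (protocol "-1" = all).
        {**common, "RuleNumber": 100, "Protocol": "-1",
         "RuleAction": "deny", "CidrBlock": blocked_cidr},
        # Then the broad allows (protocol "6" = TCP).
        {**common, "RuleNumber": 200, "Protocol": "6", "RuleAction": "allow",
         "CidrBlock": "0.0.0.0/0", "PortRange": {"From": 80, "To": 80}},
        {**common, "RuleNumber": 300, "Protocol": "6", "RuleAction": "allow",
         "CidrBlock": "0.0.0.0/0", "PortRange": {"From": 443, "To": 443}},
    ]


def apply_entries(ec2_client, entries):
    """Send each entry to AWS; ec2_client is assumed to be a boto3 EC2 client."""
    for entry in entries:
        ec2_client.create_network_acl_entry(**entry)


entries = build_inbound_entries("acl-0123456789abcdef0", "198.51.100.7/32")
for e in entries:
    print(e["RuleNumber"], e["RuleAction"], e["CidrBlock"])
```

Keeping the rule definitions separate from the API call makes it easy to review the numbering before anything touches a live NACL.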

The Stateless Headache

I mentioned NACLs are stateless. This is the biggest “questionable choice” in my book. It means you have to explicitly manage both the request and the response as completely separate pieces of traffic.

If you allow an inbound SSH request (port 22) from someone’s IP, you must also create a corresponding outbound rule to allow the response traffic to leave. The return traffic isn’t automatically allowed because the NACL has no memory of the original request. It’s just looking at each packet in isolation. This is the opposite of Security Groups, which are stateful and handle this for you automatically.

Here’s a minimal example rule set for allowing SSH. Notice the need for two sets of rules.

Inbound Rules:

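| Rule # | Type | Protocol | Port Range | Source | Allow/Deny |
|---|---|---|---|---|---|
| 100 | SSH | TCP | 22 | 192.0.2.1/32 | ALLOW |
| 200 | SSH | TCP | 22 | 203.0.113.5/32 | ALLOW |
| 32767 | All Traffic | All | All | 0.0.0.0/0 | DENY |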

Outbound Rules:

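| Rule # | Type | Protocol | Port Range | Destination | Allow/Deny |
|---|---|---|---|---|---|
| 100 | Custom | TCP | 32768-65535 | 192.0.2.1/32 | ALLOW |
| 200 | Custom | TCP | 32768-65535 | 203.0.113.5/32 | ALLOW |
| 32767 | All Traffic | All | All | 0.0.0.0/0 | DENY |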

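See that? The outbound rule isn’t for port 22. The reply leaves your instance from port 22, but NACL port ranges match on the destination port, and the destination is whatever ephemeral port the client’s operating system picked for its side of the connection. The 32768-65535 range used here covers typical Linux clients; some sources, such as NAT gateways and Elastic Load Balancing, use 1024-65535, so widen the range if you don’t control what’s connecting. You’re allowing the return traffic for your established connections. Forget this step, and the client will just sit there timing out, wondering what they did wrong.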
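To make the two-direction requirement concrete, here is a toy simulation of an SSH round trip through a stateless filter. The evaluate and ssh_round_trip functions and the tuple layout are assumptions made for this sketch, not an AWS API; the point is only that the request and the reply are judged separately.

```python
import ipaddress


def evaluate(rules, peer_ip, port):
    """First-match evaluation over (number, cidr, port_from, port_to, action)."""
    addr = ipaddress.ip_address(peer_ip)
    for _number, cidr, lo, hi, action in sorted(rules):
        if addr in ipaddress.ip_network(cidr) and lo <= port <= hi:
            return action
    return "DENY"  # implicit deny


def ssh_round_trip(inbound, outbound, client_ip, client_port):
    """A connection works only if the request AND the reply are both allowed."""
    request_ok = evaluate(inbound, client_ip, 22) == "ALLOW"
    # The reply leaves from port 22 and is addressed to the client's ephemeral port.
    reply_ok = evaluate(outbound, client_ip, client_port) == "ALLOW"
    return request_ok and reply_ok


inbound = [(100, "192.0.2.1/32", 22, 22, "ALLOW")]
outbound_missing = []
outbound_ok = [(100, "192.0.2.1/32", 32768, 65535, "ALLOW")]

print(ssh_round_trip(inbound, outbound_missing, "192.0.2.1", 50000))  # False
print(ssh_round_trip(inbound, outbound_ok, "192.0.2.1", 50000))       # True
print(ssh_round_trip(inbound, outbound_ok, "192.0.2.1", 20000))       # False
```

The last line shows why the width of the ephemeral range matters: a client that picks a source port outside the range you allowed gets the same silent timeout as having no outbound rule at all.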

Best Practices and Final Thoughts

  1. Use NACLs Sparingly: Their main use case is as a backstop for your Security Groups. Need to block an entire IP range at the subnet level? NACL. Need to block a specific port across every instance in a subnet? NACL. For 95% of your access control, you should be using Security Groups.
  2. Plan Your Numbering: Number your rules in increments of 100: 100, 200, 300. Always. It’s the closest thing to a universal best practice we have.
  3. Deny First, Allow Later: Put your explicit DENY rules at the top (low numbers) and your broad ALLOW rules further down (higher numbers).
  4. Remember the Ephemeral Ports: Any outbound rule for return traffic must account for the high-numbered ephemeral ports on the client that the response is addressed to. This is the most common “it’s not working” issue.
  5. Test Relentlessly: Use tcpdump or VPC Flow Logs to see if traffic is hitting your instance and being dropped, or if it’s being stopped dead at the subnet gate by your NACL. It saves hours of guesswork.
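For the flow-log half of that last tip, a short sketch of pulling rejected records out of VPC Flow Logs might look like this. It assumes the default (version 2) space-separated record format; the sample lines, account ID, and interface ID are made up for illustration, and parse_record and rejected are hypothetical helper names.

```python
FIELDS = ("version account_id interface_id srcaddr dstaddr srcport dstport "
          "protocol packets bytes start end action log_status").split()


def parse_record(line):
    """Split one default-format flow log line into a field dictionary."""
    values = line.split()
    if len(values) != len(FIELDS):
        raise ValueError(f"expected {len(FIELDS)} fields, got {len(values)}")
    return dict(zip(FIELDS, values))


def rejected(lines):
    """Yield parsed records whose action is REJECT."""
    for line in lines:
        record = parse_record(line)
        if record["action"] == "REJECT":
            yield record


# Illustrative records only; not taken from a real account.
sample = [
    "2 123456789012 eni-0example 192.0.2.1 10.0.1.5 50000 22 6 10 840 1700000000 1700000060 ACCEPT OK",
    "2 123456789012 eni-0example 198.51.100.7 10.0.1.5 40000 80 6 3 180 1700000000 1700000060 REJECT OK",
]

for r in rejected(sample):
    print(r["srcaddr"], r["dstport"], r["action"])  # 198.51.100.7 80 REJECT
```

A REJECT record on its own doesn’t say whether a Security Group or a NACL did the rejecting, so treat this as a starting point for narrowing things down rather than a verdict.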

Mastering NACL rule evaluation is about embracing the pedantry. It’s a blunt instrument that demands precision. Get the order wrong, and it’s useless. Get the statelessness wrong, and it’s broken. But get it right, and you have a powerful, subnet-wide layer of control that works exactly as advertised.