Right, let’s settle this. You’re staring at the AWS console, your cursor hovering over a bewildering alphabet soup of services, and you’re thinking, “Which one of you beautiful, over-engineered monsters do I need?” Don’t worry, I’ve been there. Choosing between Kinesis, SQS, SNS, and EventBridge is less about finding the “best” one and more about matching the right tool to the job. Get it wrong, and you’ll be trying to hammer in a nail with a flamethrower. Effective, but messy and wildly inefficient.

The core of the confusion comes from a simple fact: they all move data from point A to point B. The how and the why are what separate them. Let’s break it down.

The Quick-Start Guide: A Mental Model

Think of it like this:

  • SQS (Simple Queue Service) is a point-to-point, pull-based queue. It’s a message queue in the classic sense. Producers put messages in, and one group of consumers pulls them out. Each message is processed by one consumer, which then deletes it. It’s for asynchronous, decoupled processing. Think: offloading a time-consuming image resizing task from your web server to a fleet of worker EC2 instances.
  • SNS (Simple Notification Service) is a pub/sub fan-out system. One message from a producer is pushed to multiple subscribers simultaneously. Those subscribers can be SQS queues, HTTP endpoints, Lambdas, emails, you name it. It’s for broadcasting. Think: an order confirmation event that needs to trigger a database update, send a confirmation email, and ping a logistics API—all at once.
  • EventBridge is SNS’s more sophisticated, event-driven cousin. It’s a serverless event bus that can route events based on content (not just a dumb fan-out). Its real power is in ingesting events from AWS services (like an S3 bucket upload), from SaaS partners (like Datadog or PagerDuty), and from your own applications, then routing them to the right targets based on powerful rules. It’s for building reactive, event-driven architectures across different systems.
  • Kinesis Data Streams is a real-time, ordered data streaming platform. It’s not a queue you process and delete from; it’s a continuous, append-only log of data records. Multiple applications can read from the same stream simultaneously (at their own pace!) to do different things. It’s for real-time analytics, monitoring, and processing continuous data like clickstreams, logs, and telemetry.

When to Use Which: A Practical Guide

Use SQS when you need simple, reliable, asynchronous job processing. You want to decouple your web tier from your worker tier and make sure every unit of work gets handled. Standard queues deliver at least once; FIFO queues add exactly-once processing if duplicates are a problem. Messages are retained for 4 days by default, configurable up to 14 days, which is great for dealing with backlogs.

# Producer: Your web app adds a task to the queue
import boto3
sqs = boto3.client('sqs')
queue_url = 'YOUR_QUEUE_URL'

response = sqs.send_message(
    QueueUrl=queue_url,
    MessageBody='{"image_id": "img_123", "size": ["small", "medium"]}'
)
print(f"Message ID: {response['MessageId']}")
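
The other half of the pattern is the worker that pulls tasks off the queue. Here is a minimal sketch of one polling pass, assuming `sqs` is a boto3 SQS client and that message bodies are JSON. The names `poll_once` and `handler` are made up for illustration.

```python
import json


def poll_once(sqs, queue_url, handler, max_messages=10, wait_seconds=20):
    """Receive one batch, run handler on each body, delete on success.

    `sqs` is expected to be a boto3 SQS client (boto3.client('sqs')).
    Returns the number of messages that were handled and deleted.
    """
    response = sqs.receive_message(
        QueueUrl=queue_url,
        MaxNumberOfMessages=max_messages,  # SQS allows 1 to 10 per call
        WaitTimeSeconds=wait_seconds,      # long polling, up to 20 seconds
    )
    handled = 0
    # 'Messages' is absent from the response when the queue is empty.
    for message in response.get('Messages', []):
        try:
            handler(json.loads(message['Body']))
        except Exception:
            # Leave the message alone; it becomes visible again after the
            # visibility timeout and will be retried.
            continue
        sqs.delete_message(
            QueueUrl=queue_url,
            ReceiptHandle=message['ReceiptHandle'],
        )
        handled += 1
    return handled
```

Deleting only after the handler succeeds is what gives you retries for free. A real worker would call this in a loop and probably pair the queue with a dead-letter queue for messages that keep failing.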

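Use SNS when you need to fan-out a single event to multiple, independent systems. From the publisher’s side it’s fire-and-forget; SNS doesn’t wait for subscribers to finish their work. If a subscriber stays unreachable, SNS retries according to its delivery policy and then drops the message, unless you’ve attached a dead-letter queue or that subscriber is itself a queue (a very common and powerful pattern).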

# Subscribing an SQS queue to an SNS topic is the classic resiliency pattern
import boto3
sns = boto3.client('sns')
topic_arn = 'YOUR_TOPIC_ARN'
queue_arn = 'YOUR_QUEUE_ARN'

response = sns.subscribe(
    TopicArn=topic_arn,
    Protocol='sqs',
    Endpoint=queue_arn
)
# Any message published to the SNS topic is now pushed to the SQS queue,
# provided the queue's access policy allows this topic to call sqs:SendMessage.
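
The subscription alone is usually not enough: the queue also needs an access policy that lets the topic send to it. Here is a rough sketch of that policy, assuming `sqs` is a boto3 SQS client. The helper names and the `Sid` label are invented for this example.

```python
import json


def sns_to_sqs_policy(queue_arn, topic_arn):
    """Build an SQS access policy letting one SNS topic send to one queue."""
    return {
        'Version': '2012-10-17',
        'Statement': [{
            'Sid': 'AllowSnsTopicToSend',  # arbitrary label
            'Effect': 'Allow',
            'Principal': {'Service': 'sns.amazonaws.com'},
            'Action': 'sqs:SendMessage',
            'Resource': queue_arn,
            # Only this topic may send, not every topic in every account.
            'Condition': {'ArnEquals': {'aws:SourceArn': topic_arn}},
        }],
    }


def allow_topic(sqs, queue_url, queue_arn, topic_arn):
    """Attach the policy to the queue; `sqs` is a boto3 SQS client."""
    policy = sns_to_sqs_policy(queue_arn, topic_arn)
    sqs.set_queue_attributes(
        QueueUrl=queue_url,
        Attributes={'Policy': json.dumps(policy)},  # must be a JSON string
    )
```

Note that this replaces whatever policy the queue already had, so merge statements yourself if the queue has other senders.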

Use EventBridge when you’re reacting to events, especially from AWS services, or building complex, content-based routing logic across AWS and third-party services. Want to trigger a Lambda every time an EC2 instance in us-east-1 enters the running state? EventBridge rules are your answer. It’s the glue for a truly event-driven AWS ecosystem.
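
To make content-based routing concrete, here is a small sketch of a rule that matches EC2 instance state-change events and sends them to a single target. It assumes `events` is a boto3 EventBridge client (`boto3.client('events')`). The function names, the rule name, and the target `Id` are placeholders.

```python
import json


def ec2_state_pattern(states):
    """Event pattern matching EC2 instance state-change notifications."""
    return {
        'source': ['aws.ec2'],
        'detail-type': ['EC2 Instance State-change Notification'],
        'detail': {'state': list(states)},  # e.g. ['running', 'terminated']
    }


def create_rule(events, rule_name, target_arn, states=('running',)):
    """Create the rule and attach one target.

    `events` is expected to be a boto3 EventBridge client.
    """
    events.put_rule(
        Name=rule_name,
        EventPattern=json.dumps(ec2_state_pattern(states)),  # JSON string
        State='ENABLED',
    )
    events.put_targets(
        Rule=rule_name,
        Targets=[{'Id': 'primary-target', 'Arn': target_arn}],
    )
```

Rules live in a single Region, so the "in us-east-1" part comes from where you create the rule. If the target is a Lambda function, it also needs a resource-based permission that lets EventBridge invoke it.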

Use Kinesis Data Streams when you’re dealing with high-volume, real-time data that needs to be processed by multiple consumers, with order preserved within each shard. The key differentiators are order and replayability. Need to run a real-time dashboard, feed your ML model, and archive raw data all from the same stream of data? Kinesis lets you do that. You’re not just sending a message; you’re building on a continuous, living stream of truth.

# Putting a record into a Kinesis data stream
import json
import boto3
kinesis = boto3.client('kinesis')
stream_name = 'my-clickstream'

record = {
    'user_id': 'user_123',
    'page_url': '/product/shoes',
    'timestamp': '2023-10-27T16:00:00Z'
}

response = kinesis.put_record(
    StreamName=stream_name,
    Data=json.dumps(record).encode('utf-8'),  # Data is a bytes blob
    PartitionKey='user_123'  # Same key -> same shard, which keeps this user's records in order
)
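
Reading is where replayability shows up. The sketch below reads one shard from its oldest retained record, assuming `kinesis` is a boto3 Kinesis client and the records hold JSON. The function name and the `max_batches` cap are invented for illustration.

```python
import json


def read_shard_from_start(kinesis, stream_name, shard_id, max_batches=5):
    """Read up to max_batches batches from the oldest retained record.

    `kinesis` is expected to be a boto3 Kinesis client.
    Returns the decoded records in the order the shard stored them.
    """
    iterator = kinesis.get_shard_iterator(
        StreamName=stream_name,
        ShardId=shard_id,
        ShardIteratorType='TRIM_HORIZON',  # start at the oldest record
    )['ShardIterator']

    records = []
    for _ in range(max_batches):
        if iterator is None:  # the shard was closed and is fully read
            break
        batch = kinesis.get_records(ShardIterator=iterator, Limit=100)
        for record in batch['Records']:
            records.append(json.loads(record['Data']))  # Data arrives as bytes
        iterator = batch.get('NextShardIterator')
    return records
```

Nothing is deleted by reading, so a second consumer can run the same loop and see the same records. A long-running reader should pause between `get_records` calls to stay under the per-shard read limits. In practice most people let Lambda or the Kinesis Client Library manage iterators and checkpoints.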

The Pitfalls and “Gotchas”

  • Standard SNS topics don’t guarantee order. At all. If you care about the sequence of events, do not use a standard topic. Use an SNS FIFO topic with SQS FIFO (First-In-First-Out) queues as subscribers, or use Kinesis.
  • SQS Standard queues are “at-least-once” delivery. This means, in rare edge cases, you might get the same message twice. Your consumer logic must be idempotent—processing the same message twice should not cause problems. If you can’t handle that, use SQS FIFO for “exactly-once” processing.
  • Kinesis is not a queue. It’s a durable log. This is the biggest mental hurdle. You manage consumers via shard iterators, and you have to think about throughput at the shard level (in provisioned mode, 1 MB/s or 1,000 records/s per shard for writes and 2 MB/s per shard for reads). If you don’t understand partition keys, shards, and scaling, you will run into throttling errors (ProvisionedThroughputExceededException), and it will hurt.
  • EventBridge has a default quota on rules per event bus (300 in most Regions at the time of writing, 100 in some). It can feel low once several teams hang their rules off the same bus. You can request an increase, but it’s a classic example of AWS’s “defaults are for beginners” philosophy that bites experienced users. Always check the service quotas first.
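
The shard-level write limits in the Kinesis bullet turn into simple arithmetic. Here is a back-of-the-envelope sketch, assuming provisioned mode, traffic spread evenly across partition keys, and 1 MB counted as 10^6 bytes. The function name is made up.

```python
import math

# Per-shard write limits quoted in the text (provisioned capacity mode).
WRITE_BYTES_PER_SHARD = 1_000_000   # assumption: treating 1 MB/s as 10^6 bytes
WRITE_RECORDS_PER_SHARD = 1_000     # records per second


def shards_for_writes(records_per_second, avg_record_bytes):
    """Smallest shard count that satisfies both write limits."""
    by_count = math.ceil(records_per_second / WRITE_RECORDS_PER_SHARD)
    by_bytes = math.ceil(
        records_per_second * avg_record_bytes / WRITE_BYTES_PER_SHARD
    )
    return max(1, by_count, by_bytes)


# 5,000 small records/s: the record-count limit dominates.
print(shards_for_writes(5_000, 500))     # 5
# 200 large records/s: the byte limit dominates.
print(shards_for_writes(200, 20_000))    # 4
```

Treat the result as a floor, not a plan. A single hot partition key still lands on one shard and can throttle it even when the stream as a whole has headroom.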

The most powerful architectures combine these services. The classic pattern? SNS -> SQS. Let SNS handle the fan-out to multiple SQS queues, and let each queue provide durability and throttling for its consumer service. Or Kinesis -> Lambda, using the stream as the durable source of truth and Lambda to process records in batches. Choose the tool for the job, and don’t be afraid to use more than one.
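
One practical detail of the SNS -> SQS pattern: unless raw message delivery is turned on for the subscription, the consumer receives the payload wrapped in an SNS JSON envelope. Here is a small sketch of unwrapping it, assuming the publisher sent JSON. The function name is hypothetical.

```python
import json


def unwrap_sns_envelope(sqs_body):
    """Return the original published payload from an SQS message body.

    Assumes raw message delivery is disabled (the default), so SNS wraps
    the payload in a JSON envelope with a 'Message' string field.
    """
    envelope = json.loads(sqs_body)
    if envelope.get('Type') != 'Notification':
        raise ValueError('not an SNS notification envelope')
    # 'Message' holds the published text; here we assume it is JSON too.
    return json.loads(envelope['Message'])
```

The envelope also carries a `MessageId`, which can serve as a deduplication key if your consumer needs to be idempotent.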