31.7 Scaling Kinesis: Shard Splitting, Merging, and On-Demand Mode

Alright, let’s talk about making your Kinesis stream actually keep up with the real world. You built this thing to handle a firehose of data, but what happens when the firehose suddenly becomes a fire-nado? Or, more embarrassingly, when it turns into a gentle trickle and you’re paying for a firehose? That’s where scaling comes in, and Kinesis gives you two main levers to pull: the manual, surgical control of shard operations (splitting and merging) and the glorious, set-it-and-forget-it (but not really) chaos of On-Demand mode. Let’s get into it.
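To make the "surgical" part a bit more concrete, here is a minimal sketch of the arithmetic behind an even shard split. It assumes the documented SplitShard behavior, where the value you pass as NewStartingHashKey (and every hash key above it) goes to one child shard and everything below it goes to the other. The function name and the sample range are hypothetical; in a real stream you would read the range from the shard's description rather than hard-coding it.

```python
MAX_HASH_KEY = 2**128 - 1  # Kinesis hash keys are 128-bit integers


def even_split_point(starting_hash_key: str, ending_hash_key: str) -> str:
    """Return the hash key where the upper child shard should start.

    Hash keys are taken and returned as decimal strings, which is how
    the API reports a shard's hash key range.
    """
    start = int(starting_hash_key)
    end = int(ending_hash_key)
    if end <= start:
        raise ValueError("shard covers a single hash key and cannot be split")
    # Round up so the two children differ in size by at most one key.
    return str((start + end + 1) // 2)


# Hypothetical shard covering the whole key space.
split_at = even_split_point("0", str(MAX_HASH_KEY))
print(split_at == str(2**127))  # True
```

An even split is only the right call when traffic is spread evenly across the shard's range. If one hot partition key is the problem, no split point will help, because a single key always hashes to a single shard.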

31.6 Kinesis vs SQS vs SNS vs EventBridge: Choosing the Right Service

Right, let’s settle this. You’re staring at the AWS console, your cursor hovering over a bewildering alphabet soup of services, and you’re thinking, “Which one of you beautiful, over-engineered monsters do I need?” Don’t worry, I’ve been there. Choosing between Kinesis, SQS, SNS, and EventBridge is less about finding the “best” one and more about matching the right tool to the job. Get it wrong, and you’ll be trying to hammer in a nail with a flamethrower. Effective, but messy and wildly inefficient.

31.5 Kinesis Data Analytics: SQL and Apache Flink on Streaming Data

Right, so you’ve got a Kinesis Data Stream humming along, dutifully shoveling data into Firehose or maybe an S3 bucket. That’s fine. It’s the data equivalent of putting everything in a big box to sort through later. But what if you need to know what’s in the box now? Not in five minutes, not after a Lambda runs, but right this second. That’s where Kinesis Data Analytics (KDA) comes in. Think of it as your SQL-speaking, caffeine-addled analyst who can look at a firehose of data and tell you the running average, the top trending items, or an emerging anomaly, all in real time. It’s SQL (or Apache Flink in Java or Scala) on live data, and it’s shockingly powerful once you get your head around it.

31.4 Kinesis Data Firehose: Managed Delivery to S3, Redshift, OpenSearch, Splunk

Right, so you’ve got data streaming in, and you need to get it somewhere for storage or analysis. Kinesis Data Streams is the raw firehose; Kinesis Data Firehose is the attachment that aims it for you. Think of it as the difference between a pile of lumber and a pre-fab IKEA bookshelf. One gives you ultimate flexibility (and a lot of work), the other gets the job done quickly, albeit with some… interesting design choices.

31.3 Kinesis Client Library (KCL) and Lambda Trigger Integration

Right, so you’ve got your Kinesis Data Stream humming along, shoveling data records like there’s no tomorrow. The next question is the fun one: how do you actually consume this firehose without building a complex, state-managing, shard-balancing monster of a service? You’ve got two primary flavors: run the Kinesis Client Library (KCL) yourself on a fleet of EC2 instances, or let AWS do the heavy lifting with a Lambda trigger. I’m going to assume you’re here because you prefer “less servers” to “more servers,” so let’s dive into the Lambda integration. It’s brilliant, but it has its own… idiosyncrasies.

31.2 Producer and Consumer APIs: PutRecord, GetRecords, and Enhanced Fan-Out

Alright, let’s talk about getting data in and out of Kinesis. This is where the rubber meets the road, or more accurately, where your events meet the stream. The API surface here is deceptively simple, which is both a blessing and a curse. A blessing because you can get started in minutes; a curse because the real devil is in the details of scaling, error handling, and not accidentally setting your wallet on fire with the bill for Enhanced Fan-Out.

31.1 Kinesis Data Streams: Shards, Records, Partition Keys, and Sequence Numbers

Right, let’s talk about Kinesis Data Streams. Think of it as Amazon’s answer to “what if we built a super-scalable, durable log, but put it on a credit card and made you pay for every single byte that moves through it?” It’s a fantastic service, but you need to understand its moving parts or you’ll either overpay, underperform, or accidentally lose data. And I refuse to let that happen to you.
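One of those moving parts is worth sketching right away: how a partition key ends up on a shard. Kinesis runs the partition key through MD5 and treats the result as a 128-bit integer, then routes the record to whichever shard's hash key range contains it. The sketch below mimics that lookup locally. The function names, the two-shard layout, and the big-endian reading of the digest are assumptions for illustration, not something lifted from an SDK.

```python
import hashlib


def partition_key_to_hash(partition_key: str) -> int:
    """MD5 the partition key and read the digest as a 128-bit integer."""
    digest = hashlib.md5(partition_key.encode("utf-8")).digest()
    # Assumption: the digest is interpreted as a big-endian integer.
    return int.from_bytes(digest, "big")


def pick_shard(partition_key: str, shards: dict[str, tuple[int, int]]) -> str:
    """Return the ID of the shard whose inclusive hash range holds the key."""
    hash_key = partition_key_to_hash(partition_key)
    for shard_id, (start, end) in shards.items():
        if start <= hash_key <= end:
            return shard_id
    raise LookupError("no shard covers this hash key")


# Hypothetical two-shard stream that splits the key space in half.
SHARDS = {
    "shardId-000000000000": (0, 2**127 - 1),
    "shardId-000000000001": (2**127, 2**128 - 1),
}

print(pick_shard("user-42", SHARDS))
```

The practical takeaway is that the same partition key always lands on the same shard, which is what gives you per-key ordering and also what creates hot shards when one key dominates your traffic.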

...