Replication

14.8 S3 Batch Operations: Processing Millions of Objects at Scale

Right, so you’ve got a few million objects sitting in a bucket. Maybe you need to change their storage class, add tags, or copy them to another bucket. You’re not going to do that by hand, are you? Of course not. You’re going to fire up S3 Batch Operations, which is essentially your personal robot army for S3 object management. It’s the tool you use when a simple aws s3 sync just won’t cut the mustard and you’d rather not write a bespoke Lambda function to handle the sheer scale.

14.7 S3 Object Lambda: Transforming Data On the Fly During GET

Right, so you’ve got your data sitting in S3. It’s pristine, it’s perfect. But then the requests start rolling in. “Can we get this CSV file as JSON?” “I need this image as a WebP, not a PNG.” “Can we redact the personally identifiable information (PII) from this document before my user sees it?” The old, tedious way would be to create a whole ETL pipeline: trigger a Lambda on upload to transform the object into every possible format, store them all, and then hope you guessed right what the user would need. It’s wasteful, it’s expensive, and it’s frankly a bit daft. It’s like cooking every item on the menu the second a customer walks in, just in case they order it.

14.6 Presigned URLs: Granting Temporary Access Without AWS Credentials

Right, let’s talk about one of the most useful Swiss Army knives in the S3 toolkit: the presigned URL. Here’s the core problem it solves: you have an object in a private bucket. You want to let someone, whether a user on your website, a colleague, or a third party, download it (or upload it) without giving them your precious, all-powerful AWS credentials. You also don’t want to make the bucket public and unleash chaos upon the world.

14.5 S3 Event Notifications: Triggering Lambda, SQS, SNS on Object Events

Right, so you’ve got your data sitting in S3. Great. But static data is, well, static. The real magic happens when your buckets can tell you things, when they can raise their digital hand and say, “Hey, a new file just landed,” or “Psst, someone deleted that important report.” That’s S3 Event Notifications. It’s how you turn a dumb storage bin into the central nervous system of your data pipeline.
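To make the “bucket raises its hand” idea concrete, here is a minimal sketch of the receiving end: a function that pulls the event name, bucket, and key out of a notification payload. The sample event is trimmed down and its bucket and key names are made up for illustration; the one detail worth noticing is that object keys arrive URL-encoded, so they need decoding before use.

```python
from urllib.parse import unquote_plus


def objects_from_event(event):
    """Return (event_name, bucket, key) for each record in an S3 notification."""
    results = []
    for record in event.get("Records", []):
        s3 = record["s3"]
        # Keys are URL-encoded in the payload (a space arrives as '+').
        key = unquote_plus(s3["object"]["key"])
        results.append((record["eventName"], s3["bucket"]["name"], key))
    return results


# Trimmed, hypothetical payload; real events carry more fields.
sample_event = {
    "Records": [
        {
            "eventSource": "aws:s3",
            "eventName": "ObjectCreated:Put",
            "s3": {
                "bucket": {"name": "example-uploads"},
                "object": {"key": "reports/Q1+summary.csv", "size": 1024},
            },
        }
    ]
}

print(objects_from_event(sample_event))
```

The same parsing works whether the notification lands in a Lambda function directly or is unwrapped from an SQS or SNS message first.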

14.4 S3 Replication: CRR and SRR, Replication Rules, and IAM Role Requirements

Right, let’s talk about S3 Replication. This is the feature that stops you from having a single, catastrophic “oops” moment with your data. The core idea is simple: when you drop a file into one bucket, S3 can automatically and asynchronously copy it to another bucket for you. But as with most things in AWS, the devil is in the details, and oh boy, are there details.

The first fork in the road is choosing your replication type. You’ve got Cross-Region Replication (CRR) and Same-Region Replication (SRR). The names are admirably self-explanatory. CRR is for disaster recovery, keeping your data a safe distance away from a regional meteor strike or, more likely, a configuration apocalypse. SRR is your go-to for operational reasons: maybe you need to aggregate logs from different accounts into a single bucket, or you’re creating a strict production/staging separation where your staging environment needs a near-real-time copy of production data without the risk of it mucking about in the actual production bucket.
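As a rough sketch of what a replication rule looks like on paper, the snippet below builds a configuration in the shape the replication API expects, plus the trust policy that lets S3 assume the replication role. The bucket name, role ARN, account ID, and rule ID are all placeholders invented for this example, and the helper functions are not part of any SDK. Note that CRR and SRR use the same rule shape; the only difference is whether the destination bucket lives in another region.

```python
import json


def replication_config(role_arn, dest_bucket, rule_id, prefix=""):
    """Build a single-rule replication configuration (empty prefix = whole bucket)."""
    return {
        "Role": role_arn,
        "Rules": [
            {
                "ID": rule_id,
                "Priority": 1,
                "Status": "Enabled",
                "Filter": {"Prefix": prefix},
                "DeleteMarkerReplication": {"Status": "Disabled"},
                "Destination": {"Bucket": f"arn:aws:s3:::{dest_bucket}"},
            }
        ],
    }


def replication_type(source_region, dest_region):
    """CRR and SRR differ only in where the destination bucket lives."""
    return "SRR" if source_region == dest_region else "CRR"


# Trust policy allowing the S3 service to assume the replication role.
TRUST_POLICY = {
    "Version": "2012-10-17",
    "Statement": [
        {
            "Effect": "Allow",
            "Principal": {"Service": "s3.amazonaws.com"},
            "Action": "sts:AssumeRole",
        }
    ],
}

config = replication_config(
    "arn:aws:iam::123456789012:role/example-replication-role",  # placeholder ARN
    "example-dr-bucket",                                        # placeholder bucket
    "replicate-everything",
)
print(replication_type("eu-west-1", "us-east-1"))
print(json.dumps(config, indent=2))
```

Both buckets need versioning enabled before S3 will accept a configuration like this, and the role still needs a permissions policy on top of the trust policy, which this sketch leaves out.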

14.3 Lifecycle Rules: Transitioning and Expiring Objects by Age or Prefix

Right, so you’ve got your data in S3. Great. But unless you’re made of money and enjoy watching your CFO have an aneurysm, you can’t just leave every single file on the expensive, high-performance storage tier forever. This is where lifecycle rules come in. Think of them as your automated, hyper-efficient storage janitor. They quietly go about their business, moving things to cheaper storage or taking out the trash, all so you don’t have to.
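Here is a small sketch of what the janitor’s instructions might look like: one rule, in the shape of an S3 lifecycle configuration, that moves objects under a prefix to cheaper tiers and eventually expires them. The rule ID, prefix, and day counts are illustrative assumptions, and storage_class_at is a hypothetical helper that only models the rule on paper; it assumes objects start in STANDARD and ignores the fact that S3 applies lifecycle actions asynchronously.

```python
import json


def lifecycle_rule(rule_id, prefix, transitions, expire_after_days):
    """Build one rule in the shape of an S3 lifecycle configuration."""
    return {
        "ID": rule_id,
        "Status": "Enabled",
        "Filter": {"Prefix": prefix},
        "Transitions": [
            {"Days": days, "StorageClass": storage_class}
            for days, storage_class in transitions
        ],
        "Expiration": {"Days": expire_after_days},
    }


def storage_class_at(rule, age_days):
    """Class an object of this age would be in under the rule; None once expired."""
    if age_days >= rule["Expiration"]["Days"]:
        return None
    current = "STANDARD"  # assumption: objects are uploaded as STANDARD
    for step in sorted(rule["Transitions"], key=lambda t: t["Days"]):
        if age_days >= step["Days"]:
            current = step["StorageClass"]
    return current


rule = lifecycle_rule(
    "archive-logs", "logs/", [(30, "STANDARD_IA"), (90, "GLACIER")], 365
)
config = {"Rules": [rule]}
print(json.dumps(config, indent=2))

for age in (10, 45, 200, 400):
    print(age, storage_class_at(rule, age))
```

Under these assumed numbers, a log file spends its first month in STANDARD, two months in STANDARD_IA, the rest of the year in GLACIER, and is then deleted.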

14.2 MFA Delete: Extra Protection for Version Deletion

Alright, let’s talk about MFA Delete. You know Multi-Factor Authentication from logging into your corporate VPN or your email, right? It’s that “something you have and something you know” principle. Well, AWS, in a rare moment of genuine security foresight, decided to apply that same concept to one of the most destructive operations in S3: permanently deleting object versions.

Here’s the deal: S3 Versioning is fantastic. It’s your “undo button” for the cloud. But that “undo button” has a big, scary, permanent counterpart: a DeleteObject call that names a specific version ID. Anyone with the s3:DeleteObjectVersion permission can wipe out a version, and if they nuke all the versions of an object, it’s gone for good. MFA Delete adds a crucial second factor. Even if a bad actor gets hold of your access keys, or you accidentally grant too much permission to an IAM role (it happens to the best of us), they can’t just waltz in and permanently delete versions without also presenting a current code from the bucket owner’s MFA device.

14.1 Versioning: Enabling, Suspending, and Permanent Delete with Version ID

Right, let’s talk about S3 Versioning. This is one of those features that sounds simple on the surface—“it keeps multiple versions of an object”—but the devil, as always, is in the details. And the AWS console does its best to hide those details from you, which is why we’re having this chat. Think of versioning as the ultimate “undo” button for your bucket, but an undo button that, by default, just keeps every single change you’ve ever made, forever. This is fantastic for recovery, less fantastic for your storage bill.
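The details are easier to see in a toy model. The class below is not S3 and not an SDK; it is a hypothetical in-memory stand-in that mimics three behaviours of a versioning-enabled bucket: every put stacks a new version, a delete without a version ID only adds a delete marker, and a delete with a version ID removes that version for good.

```python
import itertools


class VersionedBucket:
    """Toy in-memory model of a versioning-enabled bucket (not real S3)."""

    def __init__(self):
        self._versions = {}  # key -> list of (version_id, body), oldest first
        self._ids = itertools.count(1)

    def put(self, key, body):
        version_id = f"v{next(self._ids)}"
        self._versions.setdefault(key, []).append((version_id, body))
        return version_id

    def get(self, key):
        """Body of the current version, or None if missing or hidden by a marker."""
        versions = self._versions.get(key)
        return versions[-1][1] if versions else None

    def delete(self, key, version_id=None):
        if version_id is None:
            # Simple delete: nothing is removed, a delete marker goes on top.
            marker_id = f"v{next(self._ids)}"
            self._versions.setdefault(key, []).append((marker_id, None))
            return marker_id
        # Delete with a version ID: that one version is permanently gone.
        self._versions[key] = [
            v for v in self._versions.get(key, []) if v[0] != version_id
        ]
        return version_id


bucket = VersionedBucket()
first = bucket.put("report.csv", "draft")
second = bucket.put("report.csv", "final")

marker = bucket.delete("report.csv")             # adds a delete marker
hidden = bucket.get("report.csv")                # None: the object looks deleted

bucket.delete("report.csv", version_id=marker)   # remove the marker itself
restored = bucket.get("report.csv")              # "final" is current again

bucket.delete("report.csv", version_id=second)   # permanent delete of "final"
after_permanent = bucket.get("report.csv")       # falls back to "draft"
```

The storage-bill point falls out of the model too: nothing in put or a plain delete ever shrinks the version list, so old versions pile up until something deletes them by version ID.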

14. S3 Versioning, Lifecycle, Replication, Events, and Presigned URLs

34.7 Failover and Promotion: Turning a Standby into Primary

Right, so your primary server has decided to take an unscheduled vacation. It’s crashed, it’s gone, it’s pining for the fjords. The show must go on, and that means you need to promote one of your standby servers to take its place. This isn’t just a configuration change; it’s a state change. You’re telling a replica, “Stop following orders and start giving them.” It’s a big moment, and doing it correctly is the difference between a smooth transition and a full-blown, why-is-the-database-down-on-a-Tuesday disaster.

34.6 pg_basebackup: Creating a Base Backup

Right, let’s talk about pg_basebackup. This is the workhorse, the trusty mule of PostgreSQL physical backups. It’s not glamorous, but when the proverbial fan gets clogged, this is the tool you’ll be desperately glad you set up correctly. In essence, pg_basebackup does one thing and does it well: it connects to a running PostgreSQL server and pulls a complete, file-level copy of the entire data directory (and optionally, the WAL generated while the backup runs, which is what makes the copy consistent) to create a physical base backup. This is the literal foundation of any Point-in-Time Recovery (PITR) strategy. You can’t do PITR without a base backup to start from.
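As a sketch of what a typical invocation might look like, the helper below assembles a pg_basebackup command line without running it. The host, user, and target directory are placeholders invented for the example; the flags themselves are standard ones: -D for the target directory, -X stream to pull the WAL generated during the backup, -P for progress reporting, and -R to write the standby connection settings into the backup.

```python
import shlex


def basebackup_command(host, user, target_dir, port=5432, write_recovery_conf=False):
    """Assemble (but do not run) a pg_basebackup invocation as an argv list."""
    argv = [
        "pg_basebackup",
        "-h", host,
        "-p", str(port),
        "-U", user,
        "-D", target_dir,   # where the copy of the data directory lands
        "-X", "stream",     # stream the WAL produced while the backup runs
        "-P",               # show progress
    ]
    if write_recovery_conf:
        argv.append("-R")   # also write the standby connection settings
    return argv


command = basebackup_command(
    "primary.example.internal",      # placeholder host
    "replicator",                    # placeholder replication user
    "/var/lib/postgresql/standby",   # placeholder target directory
)
print(shlex.join(command))
```

Building the command as a list rather than a string keeps it safe to hand to subprocess.run later without any shell quoting surprises.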

34.5 Replication Slots: Ensuring WAL Retention

Right, let’s talk about replication slots. You’re probably here because you’ve seen the dreaded WARNING: oldest xmin is far in the past or, worse, a standby has fallen off the wagon because it couldn’t get the WAL files it needed. Replication slots are the solution to that second, more insidious problem. They’re a way to tell the primary server, “Hey, don’t you dare delete that WAL file until you are absolutely, positively sure my standby has consumed it.”
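A toy model makes the retention rule easier to see. The class below is a deliberately simplified, hypothetical stand-in for a primary server: positions are plain integers rather than real LSNs, and keep_without_slots stands in for whatever WAL the server would retain on its own. The point it illustrates is that a checkpoint may only recycle WAL that every slot has already confirmed.

```python
class Primary:
    """Toy model of WAL retention; integers stand in for LSNs."""

    def __init__(self, keep_without_slots=2):
        self.current_lsn = 0
        self.oldest_retained = 0
        self.slots = {}  # slot name -> oldest position that slot still needs
        self.keep_without_slots = keep_without_slots

    def create_slot(self, name):
        self.slots[name] = self.current_lsn

    def write(self, records=1):
        self.current_lsn += records

    def confirm(self, name, lsn):
        """The standby reports it has consumed everything before lsn."""
        self.slots[name] = max(self.slots[name], lsn)

    def checkpoint(self):
        """Recycle WAL that neither the default policy nor any slot still needs."""
        horizon = self.current_lsn - self.keep_without_slots
        if self.slots:
            horizon = min(horizon, min(self.slots.values()))
        self.oldest_retained = max(self.oldest_retained, horizon)
        return self.oldest_retained


# Without a slot, a lagging standby loses the WAL it still needed.
no_slot = Primary()
no_slot.write(10)
lost_from = no_slot.checkpoint()        # 8: anything before 8 is gone

# With a slot, the primary holds WAL until the standby confirms it.
with_slot = Primary()
with_slot.create_slot("standby1")
with_slot.write(10)
held_from = with_slot.checkpoint()      # 0: nothing recycled yet
with_slot.confirm("standby1", 6)
released_to = with_slot.checkpoint()    # 6: WAL before 6 can now go
```

The model also shows the catch: if a standby disappears and its slot is never dropped, the minimum never advances and WAL piles up on the primary until the disk fills.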

34.4 Logical Replication: Publications and Subscriptions

Right, so you’ve outgrown streaming replication. You need to replicate only a subset of your data, or maybe you’re doing a major version upgrade with next to no downtime. Welcome to logical replication, the grown-up version of “just copy the whole data directory.” Instead of blindly shipping every bit and byte, it streams a log of the actual data-changing operations (INSERT, UPDATE, DELETE) from one database to another. It’s smarter, more flexible, and consequently, a bit more hands-on.

34.3 Synchronous vs Asynchronous Replication

Right, let’s settle the great debate: should your replica be a dutiful, “Yes, sir, right away, sir!” subordinate, or more of an “I’ll get to it when I get to it” kind of background process? This is the core of synchronous versus asynchronous replication, and the choice is far more profound than a simple checkbox. It’s a trade-off between absolute data safety and raw performance, and getting it wrong can lead to some spectacularly unpleasant outcomes.

34.2 Streaming Replication Setup: primary_conninfo and recovery.conf

Right, let’s get your standby server listening to the primary. This isn’t just about copying files; it’s about creating a hotline between the two. The secret handshake for this connection has two parts: the primary_conninfo string, which tells the standby how to reach the primary, and the marker that tells a server “hey, you’re not the main character in this story.” Before version 12, both lived in recovery.conf; from version 12 onwards, primary_conninfo goes in postgresql.conf and an empty standby.signal file in the data directory does the marking. So first, a moment of silence for recovery.conf. In PostgreSQL 12, the designers, in their infinite wisdom, decided to merge its parameters into postgresql.conf. It’s a cleaner approach long-term, but it means you need to know which version you’re on. I’ll show you both, because I’m a brilliant friend and that’s what we do.
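To pin down the version split, here is a minimal sketch of a helper that returns the files a standby’s data directory would need on either side of the PostgreSQL 12 boundary. The function name, host, and user are hypothetical, the v12+ postgresql.conf entry is meant as a line to add to the existing file rather than a full replacement, and the helper assumes the connection string contains no single quotes that would need escaping.

```python
def standby_files(major_version, conninfo):
    """Return {filename: contents} that mark a data directory as a streaming standby."""
    # Assumption: conninfo contains no single quotes that would need escaping.
    setting = f"primary_conninfo = '{conninfo}'\n"
    if major_version >= 12:
        # v12+: the setting goes in postgresql.conf; an empty standby.signal
        # file is what marks the server as a standby.
        return {"postgresql.conf": setting, "standby.signal": ""}
    # Before v12: everything lives in recovery.conf.
    return {"recovery.conf": "standby_mode = 'on'\n" + setting}


# Placeholder connection details.
conninfo = "host=primary.example.internal port=5432 user=replicator"

for version in (11, 12):
    print(f"--- PostgreSQL {version} ---")
    for filename, contents in standby_files(version, conninfo).items():
        print(f"{filename}:")
        print(contents if contents else "(empty file)")
```

Keeping both layouts behind one function makes the difference obvious: the connection string is identical, and only the file that carries it and the standby marker change.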

34.1 WAL-Based Replication: The Fundamentals

Right, let’s talk about the beating heart of any serious PostgreSQL setup: replication. Forget the marketing fluff; this is how you turn a single, lonely database server into a resilient, scalable system. And it all starts with the Write-Ahead Log, or WAL. If your database is a novel, the WAL is the continuous, unbroken stream of every single edit the author ever made. We don’t just copy the final book; we stream the entire writing process. This is the fundamental concept behind WAL-based replication, and it’s brilliantly simple and robust.