Right, let’s talk about S3 Replication. This is the feature that stops you from having a single, catastrophic “oops” moment with your data. The core idea is simple: when you drop a file into one bucket, S3 can automatically and asynchronously copy it to another bucket for you. But as with most things in AWS, the devil is in the details, and oh boy, are there details.

The first fork in the road is choosing your replication type. You’ve got Cross-Region Replication (CRR) and Same-Region Replication (SRR). The names are admirably self-explanatory. CRR is for disaster recovery, keeping your data a safe distance away from a regional meteor strike or, more likely, a configuration apocalypse. SRR is your go-to for operational reasons: maybe you need to aggregate logs from different accounts into a single bucket, or you’re creating a strict production/staging separation where your staging environment needs a real-time copy of production data without the risk of it mucking about in the actual production bucket.

The Almighty Replication Rule

You don’t just flip a “replicate” switch on the bucket. You create a Replication Rule. This is where you get specific. You can choose to replicate the entire bucket or use prefixes (e.g., photos/ or logs/) to be surgical about what gets copied. You can even use tags, which is incredibly useful for a multi-tenant setup where you might tag objects with a tenant-id and only replicate the data for paying customers.

The most critical, and most often bungled, part of the rule is the IAM Role. S3 needs permission to read objects from your source bucket and write them to your destination bucket. You must create an IAM role that trusts the s3.amazonaws.com service principal and has a policy with these permissions. AWS will happily offer to create this role for you in the console wizard, which is great until you need to do it via Infrastructure-as-Code and realize you have no idea what permissions it actually granted. Let’s demystify it.

Here’s a Terraform example of a working role and policy. Notice that on the source side it needs the version-aware actions s3:GetObjectVersionForReplication, s3:GetObjectVersionAcl, and s3:GetObjectVersionTagging, not just the standard s3:GetObject.

resource "aws_iam_role" "replication" {
  name = "s3-replication-role"

  assume_role_policy = jsonencode({
    Version = "2012-10-17"
    Statement = [
      {
        Action = "sts:AssumeRole"
        Effect = "Allow"
        Principal = {
          Service = "s3.amazonaws.com"
        }
      },
    ]
  })
}

resource "aws_iam_policy" "replication" {
  name = "s3-replication-policy"

  policy = jsonencode({
    Version = "2012-10-17"
    Statement = [
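      {
        # Bucket-level reads: S3 must be able to read the replication
        # configuration and list the source bucket itself.
        Effect = "Allow"
        Action = [
          "s3:GetReplicationConfiguration",
          "s3:ListBucket"
        ]
        Resource = ["arn:aws:s3:::your-source-bucket"]
      },
      {
        # Object-level reads: each object version, its ACL, and its tags.
        Effect = "Allow"
        Action = [
          "s3:GetObjectVersionForReplication",
          "s3:GetObjectVersionAcl",
          "s3:GetObjectVersionTagging"
        ]
        Resource = ["arn:aws:s3:::your-source-bucket/*"]
      },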
      {
        Effect = "Allow"
        Action = [
          "s3:ReplicateObject",
          "s3:ReplicateDelete",
          "s3:ReplicateTags"
        ]
        Resource = ["arn:aws:s3:::your-destination-bucket/*"]
      }
    ]
  })
}

resource "aws_iam_role_policy_attachment" "replication" {
  role       = aws_iam_role.replication.name
  policy_arn = aws_iam_policy.replication.arn
}
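
With the role in place, the rule itself can be sketched in the same Terraform. This is a minimal sketch, not a definitive setup: it assumes AWS provider v4 or later (where aws_s3_bucket_replication_configuration is a standalone resource), reuses the placeholder bucket names from the policy, and invents a hypothetical tenant-tier tag to illustrate tag-based filtering.

```hcl
# Sketch only: bucket names and the tag key/value are placeholders.
# Both buckets must already have versioning enabled.
resource "aws_s3_bucket_replication_configuration" "tenants" {
  bucket = "your-source-bucket"
  role   = aws_iam_role.replication.arn

  rule {
    id       = "replicate-paying-tenants"
    status   = "Enabled"
    priority = 1

    # Only objects carrying this (hypothetical) tag are replicated.
    filter {
      tag {
        key   = "tenant-tier"
        value = "paid"
      }
    }

    # Stated explicitly because the rule uses a filter. Delete marker
    # replication is not supported for tag-based rules, so it stays Disabled.
    delete_marker_replication {
      status = "Disabled"
    }

    destination {
      bucket        = "arn:aws:s3:::your-destination-bucket"
      storage_class = "STANDARD"
    }
  }
}
```

Swapping the tag block for prefix = "photos/" inside filter gives the prefix-scoped version of the same rule.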

What Gets Replicated (And What Absolutely Does Not)

This is the part that will bite you. Replication is not magic; it’s a background job. It only replicates objects that are created after the rule is in place. Your existing 10 PB of cat videos? They stay put. You’ll need a separate tool, S3 Batch Replication (which runs as an S3 Batch Operations job), to backfill those, which is a conversation for another day.

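Also, be clear about what counts as a “new object.” In a versioned bucket, overwriting a key doesn’t modify anything in place; it creates a new object version, and that new version is replicated like any other upload. What live replication skips is anything that predates the rule, anything that previously failed to replicate, and objects that are themselves replicas written by another rule. Let that sink in. The console will offer to “Replicate existing objects” when you create the rule, but that kicks off a one-time S3 Batch Replication job rather than changing how live replication behaves. It’s not instant.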

Now, for the real kicker: deletes. If you delete an object without specifying a version ID, S3 adds a delete marker in the source bucket, and that delete marker is NOT replicated by default. This is a protective measure. If you want delete markers to replicate, you have to explicitly enable delete marker replication on your rule. Deleting a specific version by its version ID is never replicated, with or without that option, so a mistaken or malicious permanent delete in the source can’t wipe the copy. This is a classic AWS “sensible default that complicates everything” choice.
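
In Terraform terms, opting in to delete marker replication is one nested block on the rule. A hedged sketch, assuming AWS provider v4 or later, a prefix-based rule (tag-based rules don’t support this option), and placeholder bucket names; the logs/ prefix and the rule ID are illustrative only.

```hcl
# Sketch only: names are placeholders. A bucket holds a single replication
# configuration, so in practice this rule block would sit alongside any
# other rules in the same resource, each with a distinct priority.
resource "aws_s3_bucket_replication_configuration" "logs" {
  bucket = "your-source-bucket"
  role   = aws_iam_role.replication.arn

  rule {
    id       = "replicate-logs-with-delete-markers"
    status   = "Enabled"
    priority = 1

    filter {
      prefix = "logs/"
    }

    # Opt in: delete markers created in the source are copied to the
    # destination. Deletes of a specific version ID are still not replicated.
    delete_marker_replication {
      status = "Enabled"
    }

    destination {
      bucket = "arn:aws:s3:::your-destination-bucket"
    }
  }
}
```

The role’s policy needs s3:ReplicateDelete on the destination for this to take effect.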

The Gotchas and Rough Edges

  1. Replication Time: It’s asynchronous. It’s fast, but it’s not instantaneous. Don’t build a system that requires the object to be available in the destination bucket milliseconds after the source write.
  2. Charges: You pay for inter-region data transfer for CRR. It’s not free. SRR within the same region has no inter-region transfer cost, but you still pay for the replication requests and for storing the second copy.
  3. Object Lock: You cannot replicate from a bucket with Object Lock to one without it, or vice-versa. The retention and legal hold settings must be compatible. This is a hard stop.
  4. The Two-Bucket Tango: For replication to work, both the source and destination buckets must have versioning enabled. Not “suspended,” not “off.” Enabled. AWS will not let you create the rule otherwise, which is one of the few times it saves you from yourself here.
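
The versioning requirement from the last point can be sketched in Terraform as follows. This assumes AWS provider v4 or later and the same placeholder bucket names used elsewhere in this article; the provider alias mentioned in the comment is hypothetical.

```hcl
# Sketch only: bucket names are placeholders.
resource "aws_s3_bucket_versioning" "source" {
  bucket = "your-source-bucket"

  versioning_configuration {
    status = "Enabled" # "Suspended" is not enough for replication
  }
}

resource "aws_s3_bucket_versioning" "destination" {
  # For CRR, this resource would also set `provider` to an aliased AWS
  # provider configured for the destination region (alias is hypothetical).
  bucket = "your-destination-bucket"

  versioning_configuration {
    status = "Enabled"
  }
}
```

Adding depends_on = [aws_s3_bucket_versioning.source] to the replication configuration resource is a common way to make sure versioning is applied before the rule is created.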

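The presigned URL you generated for the source bucket? Worthless on the destination bucket, because the signature covers the specific bucket and key it was generated for. The replication status of an object is a property you can check (the x-amz-replication-status header on a HEAD or GET of the source object, surfaced as ReplicationStatus in the SDKs) to see if it’s PENDING, COMPLETED, or FAILED; the copy in the destination bucket reports REPLICA. Use this for debugging, not for real-time application logic.

Replication is a fantastic, resilient feature, but like a good assistant, it works best when you understand its quirks and don’t expect it to read your mind.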