13.2 Object Keys, Metadata, Tags, and Version IDs

Right, let’s get into the guts of what you’re actually storing in an S3 bucket. It’s not just a file. It’s an object, and that object is made up of the data itself and a whole lot of descriptive baggage. Some of this baggage is incredibly useful; some of it is just there for the ride. I’ll help you tell the difference.

The Object Key is Just a Path (But Oh, What a Path)

Think of the Object Key as the full path and filename from the root of your bucket. If you upload projects/2023/q4/budget_final_v2_really_final.xlsx, that entire string is the key. This is S3’s primary mechanism for organization. There are no real folders—S3 is a flat key-value store—but the console and most tools happily use the / character to pretend there are, which is enormously helpful for our tiny human brains.
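To see why the folders are an illusion, here is a small local sketch that mimics how a listing with a prefix and a / delimiter groups flat keys. It never talks to AWS. The function name list_level and the sample keys are made up for illustration, and the real grouping happens server-side in S3's list API, so treat this as a model rather than the implementation.

```shell
# list_level PREFIX: read flat keys on stdin and print what a "folder view"
# at PREFIX would show, using "/" as the delimiter.
# Local model for illustration only; it makes no calls to S3.
list_level() {
    awk -v prefix="$1" '
        index($0, prefix) == 1 {                  # key starts with prefix
            rest = substr($0, length(prefix) + 1)
            cut = index(rest, "/")
            if (cut > 0)
                # Everything up to the next "/" collapses into one "folder"
                print "PREFIX " prefix substr(rest, 1, cut)
            else
                print "KEY    " $0
        }
    ' | sort -u
}

printf '%s\n' \
    projects/2023/q4/budget.xlsx \
    projects/2023/q4/notes.txt \
    projects/2023/readme.md \
    projects/2022/archive.zip | list_level projects/2023/
# Prints:
# KEY    projects/2023/readme.md
# PREFIX projects/2023/q4/
```

The two q4 objects collapse into a single "folder" line purely because their keys share the text up to the next slash. Nothing called q4 exists in the bucket.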

The key determines the object’s URL. For a bucket named my-awesome-data, the object above would live at https://my-awesome-data.s3.amazonaws.com/projects/2023/q4/budget_final_v2_really_final.xlsx. This has massive implications. Want to “rename” a file? You can’t. You have to copy it to a new key and delete the old one. That is a copy followed by a delete, two separate requests, which is why it’s not atomic and can be a bit nerve-wracking for massive objects.
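A minimal sketch of that two-step "rename", assuming a helper I'm calling s3_rename (the name is mine, not an AWS command). It chains the two requests with && so the delete only happens if the copy succeeded.

```shell
# s3_rename BUCKET OLD_KEY NEW_KEY
# "Rename" an object: copy it to the new key, then delete the old key.
# The delete runs only if the copy succeeded (&&), so a failed copy
# never leaves you with zero copies of the data.
s3_rename() {
    aws s3api copy-object \
        --bucket "$1" \
        --copy-source "$1/$2" \
        --key "$3" &&
    aws s3api delete-object \
        --bucket "$1" \
        --key "$2"
}

# Usage:
#   s3_rename my-awesome-data projects/2023/q4/budget.xlsx projects/2023/q4/budget_v2.xlsx
```

Two caveats on this sketch: the --copy-source value must be URL-encoded if the key contains special characters, and a single copy request tops out at 5 GB. In practice, aws s3 mv does the same copy-then-delete dance for you and handles larger objects.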

Here’s how you’d upload an object with a specific key using the AWS CLI. Note the --key parameter.

aws s3api put-object \
    --bucket my-awesome-data \
    --key projects/2023/q4/budget.xlsx \
    --body ./local_budget.xlsx

Metadata: Headers for the Object’s Soul

When you upload a file, S3 slaps a bunch of system metadata on it, like the date it was last modified and its size. Some of that system metadata, such as Content-Type and Content-Disposition, you can set yourself. But the real power comes from user-defined metadata. This is a set of key-value pairs you provide, and S3 stores them alongside the object, returning them as HTTP headers (each one prefixed with x-amz-meta-) whenever anyone downloads it.

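This is perfect for information you need to access at the time of download. The classic example is setting the Content-Disposition header to force a download with a specific filename instead of displaying it in the browser. Because Content-Disposition is a standard HTTP header, it gets its own --content-disposition option. Anything you pass through --metadata comes back with the x-amz-meta- prefix, which browsers ignore.

aws s3api put-object \
    --bucket my-awesome-data \
    --key reports/latest.pdf \
    --body ./report.pdf \
    --content-disposition 'attachment; filename="Monthly_Report.pdf"' \
    --metadata project-id=project-123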
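To check what actually got stored, you can ask for an object's headers without downloading the body. The sketch below wraps head-object in a helper I'm calling show_metadata (a made-up name) and uses --query to pull out just the user-defined metadata map and the Content-Disposition value.

```shell
# show_metadata BUCKET KEY
# Fetch only the headers of an object (no body download) and print the
# user-defined metadata map plus the Content-Disposition value.
show_metadata() {
    aws s3api head-object \
        --bucket "$1" \
        --key "$2" \
        --query '{UserMetadata: Metadata, ContentDisposition: ContentDisposition}'
}

# Usage:
#   show_metadata my-awesome-data reports/latest.pdf
```

In the CLI output, user-defined keys appear under Metadata without the x-amz-meta- prefix. The prefix only shows up on the raw HTTP headers.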

Why not just stuff all your descriptive data in here? Because metadata is not indexable or filterable by S3 itself. You can’t ask S3, “show me all objects where the project-id metadata is project-123,” and no lifecycle rule or access policy can match on it either. For labels that S3 itself can act on, you need…

Tags: For When You Actually Need to Find Things

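Tags are also key-value pairs, but they exist for governance and cost management. You can assign up to 10 tags per object, and crucially, you can use these in S3 analytics filters, IAM policy conditions, and S3 Lifecycle rules. Unlike metadata, tags can be changed at any time without rewriting the object. (S3 still has no call that lists objects by tag; what tags buy you is rules and policies that match on them.)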

Want to apply a different lifecycle rule to all objects tagged environment: production vs. environment: dev? Tags. Want to restrict a user’s access to only objects tagged department: finance? Tags. This is how you make S3 smart about your data after you’ve uploaded it.

aws s3api put-object-tagging \
    --bucket my-awesome-data \
    --key projects/project-123/data.txt \
    --tagging 'TagSet=[{Key=environment,Value=production},{Key=project-id,Value=123}]'

The key difference from metadata? Tags are for S3 and AWS services to use. Metadata is for your application and users to use when the object is accessed. Mixing them up is a common rookie mistake.
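Here is a hedged sketch of what "S3 acting on a tag" looks like: a lifecycle configuration that expires anything tagged environment=dev. The helper name write_dev_expiry_rule, the rule ID, and the 30-day window are all assumptions for illustration. Pick values that fit your own retention policy.

```shell
# write_dev_expiry_rule FILE
# Write a lifecycle configuration that expires objects tagged
# environment=dev after 30 days. The rule ID and the 30-day window are
# illustrative assumptions, not recommendations.
write_dev_expiry_rule() {
    cat > "$1" <<'EOF'
{
  "Rules": [
    {
      "ID": "expire-dev-objects",
      "Status": "Enabled",
      "Filter": { "Tag": { "Key": "environment", "Value": "dev" } },
      "Expiration": { "Days": 30 }
    }
  ]
}
EOF
}

# Usage:
#   write_dev_expiry_rule lifecycle.json
#   aws s3api put-bucket-lifecycle-configuration \
#       --bucket my-awesome-data \
#       --lifecycle-configuration file://lifecycle.json
```

Be aware that put-bucket-lifecycle-configuration replaces the bucket's whole lifecycle configuration, so in a real bucket you would merge this rule into whatever rules already exist.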

Version ID: Your Get-Out-of-Jail-Free Card

If you enable Versioning on your bucket (and you absolutely should for any non-transient bucket), every time you upload, overwrite, or delete an object, S3 doesn’t obliterate the old one. An upload or overwrite adds a new version and assigns it a unique, random, and gloriously opaque Version ID. A delete just stacks a delete marker on top, leaving the older versions underneath.

This is your single best defense against accidental deletion, ransomware, and your own stupidity at 2 AM. That budget_final_v2_really_final.xlsx I mentioned earlier? You can overwrite it ten times and still go back to the first version.
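Turning versioning on is a single call. The sketch below wraps it in a helper I'm calling enable_versioning (my name, not AWS's) and reads the setting back so you can confirm it took.

```shell
# enable_versioning BUCKET
# Turn on versioning, then print the bucket's versioning status to confirm.
enable_versioning() {
    aws s3api put-bucket-versioning \
        --bucket "$1" \
        --versioning-configuration Status=Enabled &&
    aws s3api get-bucket-versioning --bucket "$1"
}

# Usage:
#   enable_versioning my-awesome-data
```

Once a bucket has been versioned it can only be suspended, never returned to the unversioned state, so this is a one-way door worth walking through deliberately.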

To retrieve a specific version, you need both the object key and its Version ID. Here’s how you list all versions to find the one you need:

aws s3api list-object-versions \
    --bucket my-awesome-data \
    --prefix projects/2023/q4/budget.xlsx

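The output will give you an array of versions with their VersionId and IsLatest flag, plus a separate DeleteMarkers array if anything has been deleted. Keep in mind that --prefix matches every key that starts with that string, not just the exact key. To retrieve a specific one: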

aws s3api get-object \
    --bucket my-awesome-data \
    --key projects/2023/q4/budget.xlsx \
    --version-id A0B1C2D3E4F5G6H7I8J9K0L1M2N3O4P \
    budget_old_version.xlsx
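Downloading an old version is one thing; making it the current version again is another. One common way, sketched here with a hypothetical helper named restore_version, is to copy the old version onto the same key. The old bytes become the newest version and nothing in the history is removed.

```shell
# restore_version BUCKET KEY VERSION_ID
# Copy a specific old version onto the same key. The copy becomes the
# newest version; every earlier version stays in the history.
restore_version() {
    aws s3api copy-object \
        --bucket "$1" \
        --key "$2" \
        --copy-source "$1/$2?versionId=$3"
}

# Usage:
#   restore_version my-awesome-data projects/2023/q4/budget.xlsx A0B1C2D3E4F5G6H7I8J9K0L1M2N3O4P
```

As with any copy, the --copy-source value needs URL-encoding if the key contains special characters.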

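A critical pitfall: versioning changes what “delete” means, and that cuts both ways. A plain delete (one with no Version ID) just adds a delete marker, leaving you with a versioned but empty-seeming bucket that still holds, and still bills you for, every old version. And S3 will refuse to delete the bucket itself until it is truly empty, which means deleting every version of every object and every delete marker first. That is such a pain that it effectively is a safety feature. It is not a guarantee, though: anyone with permission to delete specific versions can still destroy data for good. The real best practice? Use MFA Delete for your bucket, which requires a code from an MFA device to permanently delete a version or to change the bucket’s versioning state. It’s a hassle, but less of a hassle than telling your CEO you lost all your data.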
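To get a feel for how much is hiding in a versioned bucket, here is a dry-run sketch that lists every version and delete marker without deleting anything. The helper name plan_purge and its output format are my own invention, and it assumes keys contain no tabs or newlines, since it parses the CLI's tab-separated text output.

```shell
# plan_purge BUCKET
# Dry run: print every object version and delete marker in the bucket.
# Deletes nothing. Assumes keys contain no tabs or newlines.
plan_purge() {
    tab=$(printf '\t')
    for kind in Versions DeleteMarkers; do
        aws s3api list-object-versions \
            --bucket "$1" \
            --query "${kind}[].[Key, VersionId]" \
            --output text |
        while IFS="$tab" read -r key version_id; do
            # An empty list prints "None" with no second column; skip it.
            [ -n "$version_id" ] || continue
            printf '%s\tkey=%s\tversion-id=%s\n' "$kind" "$key" "$version_id"
        done
    done
}

# Usage:
#   plan_purge my-awesome-data
```

Each line it prints corresponds to one delete-object call with --version-id that would be needed to remove that entry for good, which is exactly the kind of operation MFA Delete is designed to gate.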