84.7 boto3: S3, DynamoDB, SQS, and EC2 from Python
Alright, let’s get our hands dirty. You’ve written some Python, and now you need it to talk to the sprawling, slightly chaotic metropolis that is AWS. Enter boto3. This isn’t some abstract library; it’s your direct line to the cloud control panel. Think of it as the Pythonic API for AWS—because typing aws cli commands into a shell script is so 2012.
First, the non-negotiable setup. You need credentials. Boto3 looks for them in this order:
- Passed directly to the boto3.client() function (great for temporary stuff, bad for hardcoding).
- Environment variables (AWS_ACCESS_KEY_ID and AWS_SECRET_ACCESS_KEY).
- A shared credentials file (usually ~/.aws/credentials).
- An IAM role (if you’re running this on an EC2 instance, Lambda, etc.).
The IAM role method is the professional’s choice for production. It means no secrets are stored on the machine; AWS just handles it. Magic. But for local dev, the credentials file is your best bet. Run aws configure if you have the AWS CLI installed, and it’ll set this up for you. If not, why don’t you have it installed? Go do that. I’ll wait.
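Before going any further, it can help to confirm which identity boto3 actually picked up. Here’s a minimal sketch, assuming a profile called dev exists in your credentials file (that name is made up, as are the helper names whoami and sts_client_for_profile). The STS get_caller_identity call itself is real and needs no special permissions.

```python
def sts_client_for_profile(profile_name="dev"):
    """Build an STS client from a named profile ('dev' is a hypothetical name)."""
    import boto3  # imported lazily so whoami() can be exercised with a stub

    session = boto3.Session(profile_name=profile_name)
    return session.client("sts")


def whoami(sts_client):
    """Return 'account-id arn' for whatever credentials the client resolved."""
    identity = sts_client.get_caller_identity()
    return f"{identity['Account']} {identity['Arn']}"
```

Calling print(whoami(sts_client_for_profile())) should print your account ID and the ARN of the user or role in play. If it raises instead, the credential chain above came up empty.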
The Two Personalities: Client vs. Resource
Boto3 has a bit of a split personality, and it’s crucial you understand the difference. You can interact with services either as a low-level Client or a higher-level Resource.
The Client offers a one-to-one mapping with the actual AWS service API. Every single operation, every parameter—it’s all there. It’s verbose, but it’s complete and always up-to-date. This is what you use when you need explicit control or when a new AWS feature has just dropped and the Resource layer hasn’t caught up yet.
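The Resource is a more Pythonic, object-oriented abstraction. Instead of getting back a dictionary full of nested keys you have to parse, you get back actual Python objects with attributes and methods. It’s cleaner and more intuitive, but it only exists for a handful of services and doesn’t cover every esoteric API call. It’s also frozen: the boto3 documentation states that the SDK team doesn’t intend to add new features to the resources interface, so anything new lands in the Client only.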
My rule of thumb? Start with Resource for its elegance. Drop down to Client when Resource is being difficult or missing a feature. Let’s see them in action.
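To make the contrast concrete, here’s a small sketch that lists bucket names both ways. The two function names are invented for illustration; list_buckets and buckets.all() are the real calls. Each function takes an already-built client or resource, so nothing here touches AWS until you pass one in.

```python
def bucket_names_via_client(s3_client):
    """Client: call the ListBuckets API and dig the names out of the dict."""
    response = s3_client.list_buckets()
    return [bucket["Name"] for bucket in response["Buckets"]]


def bucket_names_via_resource(s3_resource):
    """Resource: iterate over Bucket objects and read an attribute."""
    return [bucket.name for bucket in s3_resource.buckets.all()]
```

You’d call these as bucket_names_via_client(boto3.client('s3')) and bucket_names_via_resource(boto3.resource('s3')). Same data, two very different shapes on the way there.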
Talking to S3: Your Cloud Attic
S3 is the internet’s junk drawer, and we love it for that. Here’s how you upload a file using both methods.
import boto3

# The Client way (low-level)
s3_client = boto3.client('s3')
s3_client.upload_file(
    Filename='/path/to/my/local/file.txt',
    Bucket='my-super-cool-bucket',
    Key='path/in/s3/file.txt'  # This is the object's name
)

# The Resource way (high-level)
s3_resource = boto3.resource('s3')
bucket = s3_resource.Bucket('my-super-cool-bucket')
bucket.upload_file('/path/to/local/file.txt', 'path/in/s3/file.txt')
See how the Resource feels more natural? You’re dealing with a Bucket object, not a generic client. Downloading is just as easy with download_file, but here’s the first gotcha I’ll call out: a missing object is not a quiet failure. If the key doesn’t exist, download_file raises botocore.exceptions.ClientError with an error code of 404, and an unhandled exception will take your script down with it. Catch it wherever a missing object is a normal, expected outcome.
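Here’s one way to handle a missing object on download, as a sketch rather than the canonical pattern. The helper name download_if_present is invented; download_file and ClientError are real. The import fallback is only there so the function can be read and stub-tested on a machine without botocore installed.

```python
try:
    from botocore.exceptions import ClientError
except ImportError:  # fallback so the sketch can be stub-tested without botocore
    ClientError = Exception


def download_if_present(s3_client, bucket, key, destination):
    """Download an object; return True on success, False if it does not exist."""
    try:
        s3_client.download_file(Bucket=bucket, Key=key, Filename=destination)
    except ClientError as error:
        code = getattr(error, "response", {}).get("Error", {}).get("Code")
        if code in ("404", "NoSuchKey"):
            return False
        raise  # permissions, throttling, etc. should stay loud
    return True
```

The design choice worth noticing: only the “not found” case is swallowed. A 403 usually means your IAM policy is wrong, and you want to hear about that immediately.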
DynamoDB: NoSQL for When You’re Over Schemas
DynamoDB is a different beast. It’s a key-value store that scales to very large tables and request rates, but only if you design your tables around your access patterns. Forget everything you know about JOINs. Here’s how you put an item and read it back.
import boto3
from decimal import Decimal

dynamodb = boto3.resource('dynamodb')
table = dynamodb.Table('MyTable')

# Putting an item
response = table.put_item(
    Item={
        'UserId': 'A123',  # Your partition key (HASH)
        'OrderId': 456,  # Your sort key (RANGE), if you have one
        'FirstName': 'Jane',
        'LastName': 'Doe',
        'AccountBalance': Decimal('123.45'),  # Crucial! Never use float for money.
        'Tags': ['customer', 'active']  # Lists are fine too.
    }
)

# Getting it back
response = table.get_item(
    Key={
        'UserId': 'A123',
        'OrderId': 456
    }
)
item = response.get('Item')  # This will be None if the item wasn't found
print(item)
The biggest pitfall here? Data types. DynamoDB numbers have to go in as Python’s decimal.Decimal; hand boto3 a float and it raises a TypeError rather than risk a floating-point rounding surprise. It’s annoying, but correct. Numbers come back out as Decimal too. Also, notice the .get('Item'). A get_item request for a non-existent key doesn’t throw an error; the response simply has no Item key at all. Your code needs to handle that.
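Both pitfalls can be wrapped up in a couple of small helpers. This is a sketch under the assumption that your data is plain dicts, lists, and scalars; the names to_dynamodb and get_item_or_default are invented for illustration.

```python
from decimal import Decimal


def to_dynamodb(value):
    """Recursively replace floats with Decimal so put_item will accept the data."""
    if isinstance(value, float):
        # Going through str() keeps 1.1 as Decimal('1.1'), not its binary expansion.
        return Decimal(str(value))
    if isinstance(value, dict):
        return {key: to_dynamodb(item) for key, item in value.items()}
    if isinstance(value, list):
        return [to_dynamodb(item) for item in value]
    return value


def get_item_or_default(table, key, default=None):
    """Return the stored item, or `default` when the key is absent."""
    response = table.get_item(Key=key)
    return response.get("Item", default)
```

You’d use it as table.put_item(Item=to_dynamodb(payload)) when the payload came from somewhere float-happy, like json.loads.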
SQS: The Message Bus That Tries Its Best
SQS is the reliable, if sometimes slow, courier of the cloud. You hand it a message, and it eventually delivers it. Here’s how you send a message to a queue.
import boto3

sqs = boto3.resource('sqs')
queue = sqs.get_queue_by_name(QueueName='my-queue')

# Send a message
response = queue.send_message(MessageBody='Hello from boto3!')
# The response contains an MD5 of the body and a MessageId

# Send a message with attributes (e.g., for message filtering)
response = queue.send_message(
    MessageBody='Order processed',
    MessageAttributes={
        'OrderType': {
            'DataType': 'String',
            'StringValue': 'Electronics'
        }
    }
)
Receiving messages is where it gets interesting. You don’t “get” a message; you “lease” it. When you receive a message, it becomes invisible to other consumers for a period of time (the visibility timeout, 30 seconds by default). You must delete it within that window after you’re done processing it. If you don’t, it’ll reappear on the queue for someone else to handle. This is the “at-least-once” delivery guarantee.
messages = queue.receive_messages(MaxNumberOfMessages=10, WaitTimeSeconds=5)
for message in messages:
    print(f"Processing: {message.body}")
    # ... do your work ...
    message.delete()  # Critical! This confirms processing is done.
The landmine here is forgetting to call message.delete(). Your message will vanish for a bit, then pop back up, leading to duplicated work and you tearing your hair out wondering why your job is running twice.
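The flip side of that landmine is deleting too eagerly. Here’s a sketch of a receive loop that deletes a message only after your handler finishes without raising. The names process_batch and handler are invented; receive_messages, body, and delete are the real Resource API.

```python
def process_batch(queue, handler, wait_seconds=5):
    """Receive up to 10 messages; delete each one only if handler succeeds.

    Returns (processed, failed) counts. Failed messages are left alone, so
    they reappear on the queue once their visibility timeout expires.
    """
    processed = failed = 0
    messages = queue.receive_messages(
        MaxNumberOfMessages=10, WaitTimeSeconds=wait_seconds
    )
    for message in messages:
        try:
            handler(message.body)
        except Exception:
            failed += 1  # no delete: SQS will redeliver this message
            continue
        message.delete()
        processed += 1
    return processed, failed
```

Since a failed message comes back, and even a successful one can occasionally be delivered twice, it pays to make handler idempotent.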
EC2: The Foundation of It All
Sometimes you just need to spin up a dang server. Boto3 makes this almost too easy.
import boto3

ec2 = boto3.resource('ec2')

# The most basic launch
instances = ec2.create_instances(
    ImageId='ami-12345abcde',  # This is specific to the region and OS!
    MinCount=1,
    MaxCount=1,
    InstanceType='t3.micro',
    KeyName='my-ssh-key-pair-name'  # You need this to SSH in
)
new_instance = instances[0]
# Right after launch the state is usually 'pending', not 'running'
print(f"Instance ID: {new_instance.id} is now {new_instance.state['Name']}")
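A freshly launched instance usually isn’t running yet, and a forgotten one keeps billing you. Here’s a sketch that waits for the instance, does something with it, and tears it down no matter what. The names run_briefly and work are invented; wait_until_running, reload, terminate, and wait_until_terminated are real methods on the EC2 Instance resource.

```python
def run_briefly(instance, work):
    """Wait for an instance to run, call work(instance), then always terminate it."""
    try:
        instance.wait_until_running()
        instance.reload()  # refresh cached attributes such as state and IP
        return work(instance)
    finally:
        # Runs even if work() raises, so test instances never linger.
        instance.terminate()
        instance.wait_until_terminated()
```

Something like run_briefly(new_instance, lambda i: print(i.public_ip_address)) would print the address and then clean up. For anything long-lived you’d obviously skip the finally block, but for experiments it’s cheap insurance.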
The “why” behind MinCount and MaxCount: a single call can launch a whole batch of identical instances. EC2 launches as many as it can up to MaxCount, and if it can’t launch at least MinCount, it launches none at all. For a single server, set both to 1; they’re required either way.
The most common mistake? Reusing an AMI ID from a different region. AMI IDs are region-specific: the ID you copied from us-east-1 won’t resolve to that image in us-west-2, and the launch will typically fail with an invalid-AMI error. Always look up the correct AMI for your region and operating system.
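One way to avoid hardcoding AMI IDs is to ask Systems Manager for the current one in whatever region you’re using. A sketch, with a loud assumption: the parameter name below is the public one I’d expect for Amazon Linux 2023 on x86_64, so confirm it against the AWS docs first. The helper name latest_ami_id is invented; get_parameter is the real SSM call.

```python
# Assumed public parameter for Amazon Linux 2023 (x86_64); confirm the exact
# name in the AWS docs before relying on it.
AL2023_PARAMETER = (
    "/aws/service/ami-amazon-linux-latest/al2023-ami-kernel-default-x86_64"
)


def latest_ami_id(ssm_client, parameter_name=AL2023_PARAMETER):
    """Resolve an AMI ID for the region the SSM client is bound to."""
    response = ssm_client.get_parameter(Name=parameter_name)
    return response["Parameter"]["Value"]
```

Call it as latest_ami_id(boto3.client('ssm', region_name='us-west-2')) and feed the result to ImageId. Because the client is tied to a region, the ID you get back belongs to that region.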
Remember, with great power comes great responsibility—and a surprisingly large AWS bill if you forget to terminate instances. Always, always have a plan for cleaning up your test resources. Now go build something.