Uploading files and sending multipart form data are common tasks in web communication, typically when a form carries both text fields and binary file content. The multipart/form-data encoding type, defined in RFC 7578, is designed for exactly this purpose. It lets multiple pieces of data, each with its own name and optional content type, travel in a single HTTP request body, separated by a boundary string.

The Structure of Multipart Requests

When you send a multipart/form-data request, the body is not a simple JSON object or URL-encoded string. Instead, it is composed of multiple “parts,” each representing a field or a file. The Content-Type header of the request specifies a boundary, a string that must not appear in the data of any part. Each part begins with a delimiter line made of two hyphens followed by the boundary, and the body ends with the same delimiter followed by two more hyphens. Each part carries its own headers: Content-Disposition, which holds the form field name and, for file parts, the filename, and optionally Content-Type (e.g., image/jpeg for a JPEG file). The server uses the boundary to parse the request and extract the individual fields and files.
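To make the layout concrete, here is a minimal sketch that assembles a multipart body by hand using only the standard library. It is illustrative only and is not how httpx builds requests internally; the function name build_multipart_body and the sample field values are assumptions made for this example.

```python
import uuid


def build_multipart_body(fields, files):
    """Assemble a multipart/form-data body by hand (illustrative only)."""
    # A random boundary is very unlikely to occur inside the part data.
    boundary = uuid.uuid4().hex
    lines = []
    for name, value in fields.items():
        lines += [
            f"--{boundary}".encode(),  # delimiter that opens the part
            f'Content-Disposition: form-data; name="{name}"'.encode(),
            b"",  # blank line separates part headers from part data
            value.encode("utf-8"),
        ]
    for name, (filename, content, content_type) in files.items():
        lines += [
            f"--{boundary}".encode(),
            f'Content-Disposition: form-data; name="{name}"; filename="{filename}"'.encode(),
            f"Content-Type: {content_type}".encode(),
            b"",
            content,
        ]
    lines.append(f"--{boundary}--".encode())  # closing delimiter
    body = b"\r\n".join(lines) + b"\r\n"
    return f"multipart/form-data; boundary={boundary}", body


header, body = build_multipart_body(
    {"title": "Q3 report"},
    {"file": ("notes.txt", b"hello world", "text/plain")},
)
print(header)
print(body.decode("utf-8"))
```

Printing the result shows one delimiter line per part plus the closing delimiter, which is the structure a server-side parser walks through.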

Basic File Upload with httpx

The httpx library simplifies creating these requests. Pass a dictionary to the files parameter of httpx.post() (or of a client's post() method), and httpx generates a boundary and sets the Content-Type header to multipart/form-data with that boundary for you.

import httpx

# Upload a single file
with open("report.pdf", "rb") as f:
    files = {"file": ("report.pdf", f, "application/pdf")}
    response = httpx.post("https://httpbin.org/post", files=files)

print(response.status_code)
print(response.json()["files"]["file"])

In the files dictionary, the key is the form field name and the value is a tuple. The first element is the filename the server will see, the second is the file-like object opened in binary mode ('rb'), and the optional third is the MIME type. If you omit the MIME type, httpx guesses it from the filename and falls back to application/octet-stream when the extension is not recognized.

Combining Files and Form Fields

A common requirement is to upload a file alongside regular text fields, such as a title or description. Pass the text fields through the data parameter and the file parts through files. A files entry whose filename is None can also carry a non-file part with its own content type, which is useful for something like a JSON metadata field.

import httpx

# Upload a file with additional form data
with open("vacation.jpg", "rb") as f:
    data = {
        "user": "johndoe",
        "comment": "My summer vacation!"
    }
    files = {
        "file": ("vacation.jpg", f, "image/jpeg"),
        "metadata": (None, '{"tags": ["beach", "sun"]}', "application/json")  # A non-file part
    }
    response = httpx.post("https://httpbin.org/post", data=data, files=files)

result = response.json()
print(result["form"])  # Text parts: 'user', 'comment', and typically 'metadata' (it has no filename)
print(result["files"]["file"])  # The uploaded file's content as echoed back by httpbin

It’s crucial to understand the distinction between the data and files parameters here. The data parameter is for standard form fields and will be encoded as application/x-www-form-urlencoded by default. However, when the files parameter is also present, httpx overrides the main request encoding to multipart/form-data and incorporates both the data and files values into the multipart structure. The data fields become parts with a Content-Disposition of form-data and no filename.

Advanced Configuration: Explicit Tuples

For maximum control over each part of the multipart request, you can use the full four-element tuple format. The fourth element sets custom headers for a specific part. Separately, setting the filename element to None sends a part without a filename, which some APIs require.

import httpx

with open("data.bin", "rb") as f:
    files = {
        "document": ("my_file.bin", f, "application/octet-stream", {"Expires": "0"})
    }
    response = httpx.post("https://httpbin.org/post", files=files)

The four-element tuple allows you to specify the filename, the file object, the content type, and a dictionary of optional headers for that specific part.

Common Pitfalls and Best Practices

  1. Always Open Files in Binary Mode ('rb'): This is non-negotiable. The bytes of the file must be transmitted unchanged, and text mode ('r') decodes them and may translate platform-specific line endings along the way. Recent httpx versions reject text-mode file objects outright with a TypeError rather than risk sending corrupted data.
  2. Context Managers and File Handles: Using a with block (a context manager) to open the file, as shown in the examples, is a best practice. It ensures the file is closed once the block exits, even if an error occurs during the upload. Make the request inside the block: httpx reads the file while sending the request, so the handle must still be open at that point.
  3. Memory for Large Files: For very large files (e.g., several gigabytes), avoid loading the whole file into memory yourself, for example by passing the result of f.read() as the part's content. Pass the open file object instead, as in the examples above: httpx reads it in chunks while sending the request, so memory usage stays low regardless of file size.
  4. Server-Side Expectations: Always consult the API documentation. Some servers may expect the file part to have a specific name (e.g., "file" vs. "upload"). Others might require additional non-file fields to be sent in a particular way. Incorrectly naming the parts is a frequent source of 400-level errors.
  5. MIME Type Guessing: While convenient, automatic MIME type guessing is not foolproof. For critical applications, especially when the file extension is non-standard, explicitly providing the correct MIME type is more reliable.
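The MIME type point can be illustrated with the standard library's mimetypes module, which performs the same kind of extension-based lookup. The helper names guess_mime and file_part below are hypothetical, as are the sample filenames; treat this as a sketch of one way to make the fallback and the explicit override visible in your own code.

```python
import mimetypes
from pathlib import Path


def guess_mime(filename, default="application/octet-stream"):
    """Return a MIME type guessed from the file extension, or the default."""
    guessed, _encoding = mimetypes.guess_type(filename)
    return guessed or default


def file_part(path, fileobj, content_type=None):
    """Hypothetical helper: build a (filename, file, MIME type) tuple for files=."""
    name = Path(path).name  # send only the base name, not the local directory
    return (name, fileobj, content_type or guess_mime(name))


print(guess_mime("report.pdf"))     # a well-known extension
print(guess_mime("capture.zzqq9"))  # an unknown extension uses the default
print(file_part("exports/capture.zzqq9", b"a,b\n1,2\n", "text/csv"))  # explicit override
```

Passing the content type explicitly, as in the last call, avoids depending on the extension when you already know what the file contains.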