57.5 Streaming Large Responses

When dealing with large HTTP responses, such as multi-gigabyte files, extensive log dumps, or open-ended data feeds, downloading the entire body into memory before processing it is often impractical: it can cause excessive memory consumption, instability, or outright crashes. The solution is to stream the response, processing it in small, manageable chunks as they arrive from the network rather than waiting for the complete payload.

The Core Concept of Response Streaming

Streaming fundamentally changes how your application interacts with the response body. Instead of being a monolithic blob of data that you access only after it has been completely transferred, the body becomes an iterable sequence of bytes objects. The httpx library provides this through the httpx.stream() function and the Client.stream() method, both of which are used as context managers. (Unlike the requests library, httpx.get() has no stream=True argument.) Inside the context, the response body is not automatically read and held in memory. Instead, you iterate over response.iter_bytes() or one of the related iterators. Your code processes each chunk as it arrives, so peak memory use is bounded by the chunk size rather than by the size of the body.

import httpx
import hashlib

# Create a SHA-256 hash object to update incrementally
file_hash = hashlib.sha256()

with httpx.stream('GET', 'https://example.com/large_file.zip') as response:
    # Check status before starting to read the stream
    response.raise_for_status()
    
    print(f"Headers received: {response.headers}")
    
    # Iterate over the body in chunks; with no chunk_size, chunks are yielded as they arrive
    for chunk in response.iter_bytes():
        # Update the hash with the incoming chunk
        file_hash.update(chunk)
        # Here you could also write the chunk to a file
        # file_handle.write(chunk)

print(f"SHA-256 hash of the file is: {file_hash.hexdigest()}")
# The entire response body was never stored in memory at once.

Controlling Chunk Size and Using Different Iterators

The iter_bytes() method accepts an optional chunk_size parameter that sets the size of each chunk yielded during iteration (the final chunk may be smaller). When chunk_size is omitted, chunks are yielded in whatever sizes they arrive from the network. Choosing a size is a trade-off between memory usage and processing efficiency: larger chunks mean fewer iterations but more memory per chunk. httpx also provides text iterators for convenience: iter_text() decodes chunks to str on the fly, and iter_lines() yields decoded text one line at a time, which suits line-based formats such as logs. Both depend on the response encoding, which you can set manually through response.encoding before iterating if the server’s Content-Type header is incorrect or missing.

with httpx.stream('GET', 'https://example.com/server_logs.txt') as response:
    response.raise_for_status()
    
    # Manually set encoding if the server reports it incorrectly
    response.encoding = 'utf-8'
    
    # Process the response line by line as it streams
    for line in response.iter_lines():
        if 'ERROR' in line:
            print(f"Found error in log: {line}")
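Line iteration needs buffering because a network chunk can end in the middle of a line. The following is a minimal sketch of that idea, not httpx's actual implementation. The split_lines helper is hypothetical, handles only "\n" line endings, and accepts any iterable of decoded text chunks, such as the output of response.iter_text().

```python
from typing import Iterable, Iterator


def split_lines(text_chunks: Iterable[str]) -> Iterator[str]:
    """Yield complete lines from text chunks that may end mid-line."""
    pending = ""  # text seen after the most recent newline
    for chunk in text_chunks:
        pending += chunk
        # Everything before the final "\n" forms complete lines;
        # the remainder is carried over to the next chunk.
        *complete, pending = pending.split("\n")
        yield from complete
    if pending:
        # The stream ended without a trailing newline.
        yield pending


# Chunks deliberately cut a line in half to mimic network boundaries.
chunks = ["INFO start\nERR", "OR disk full\nINFO ", "done"]
lines = list(split_lines(chunks))
```

Searching each raw chunk for 'ERROR' would miss the match that straddles the first two chunks here, which is why the earlier example iterates over lines rather than chunks.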

Essential Best Practices and Common Pitfalls

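Always Use a Context Manager (with): Treat the with block as mandatory when streaming. It ensures that the response is closed and the connection is released back to the connection pool, even if an error occurs partway through the stream. A streamed response that is never closed keeps its connection checked out, and enough of these leaks will exhaust the pool. If a with block truly does not fit your code, build the request yourself, call client.send(request, stream=True), and call response.close() in a finally block.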

Check the Status Code Early: Once you enter the with block, the HTTP request has been made and the headers have been received, but the body has not. This is the perfect time to check response.raise_for_status() or inspect response.status_code. If the request failed (e.g., 404 Not Found), you can exit the context manager without wasting bandwidth and time downloading an error message body.

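Beware of Gzip Compression: httpx handles Content-Encoding in streaming mode just as it does in standard mode. The iter_bytes(), iter_text(), and iter_lines() iterators all yield decoded content, with gzip or deflate decompression applied as the data arrives. Only iter_raw() yields the bytes exactly as they came off the wire, still compressed if the server compressed them. Two pitfalls follow from this. First, when the body is compressed, the Content-Length header describes the compressed size, so it will not match the number of bytes that iter_bytes() yields; use response.num_bytes_downloaded to track progress against it. Second, if you choose iter_raw(), for example to store the payload exactly as sent, you must either decompress the chunks yourself or request an uncompressed body with headers={'Accept-Encoding': 'identity'}.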
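If you do end up handling compressed bytes yourself, for instance bytes taken from response.iter_raw(), the standard library can decompress a gzip stream incrementally. This is a minimal sketch that assumes a single-member gzip stream. The gunzip_chunks helper is hypothetical and the payload is synthetic.

```python
import gzip
import zlib
from typing import Iterable, Iterator


def gunzip_chunks(raw_chunks: Iterable[bytes]) -> Iterator[bytes]:
    """Incrementally decompress a gzip stream delivered in arbitrary chunks."""
    # wbits = 16 + MAX_WBITS tells zlib to expect a gzip header and trailer.
    decompressor = zlib.decompressobj(16 + zlib.MAX_WBITS)
    for chunk in raw_chunks:
        data = decompressor.decompress(chunk)
        if data:
            yield data
    # Emit anything still buffered once the input is exhausted.
    tail = decompressor.flush()
    if tail:
        yield tail


# Simulate a compressed body arriving in 7-byte pieces.
payload = b"line of log output\n" * 1000
compressed = gzip.compress(payload)
pieces = (compressed[i:i + 7] for i in range(0, len(compressed), 7))
restored = b"".join(gunzip_chunks(pieces))
```

The decompressor object carries its state between calls, so the chunk boundaries do not need to line up with anything in the gzip format.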

Handy for Large Downloads: The primary use case is writing large files to disk without ever holding the whole file in memory.

with httpx.stream('GET', 'https://example.com/large_video.mp4') as response:
    response.raise_for_status()
    with open('large_video.mp4', 'wb') as download_file:
        for chunk in response.iter_bytes(chunk_size=16384): # 16KB chunks
            download_file.write(chunk)
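An interrupted download leaves a truncated file at the destination path. One common way to avoid this is to write to a temporary file and rename it only after the stream has finished. The sketch below assumes the temporary file and the destination are on the same filesystem. The save_atomically helper is hypothetical and accepts any iterable of bytes chunks.

```python
import os
import tempfile
from typing import Iterable


def save_atomically(chunks: Iterable[bytes], destination: str) -> int:
    """Write chunks to a temp file, then rename it over destination."""
    directory = os.path.dirname(os.path.abspath(destination))
    fd, temp_path = tempfile.mkstemp(dir=directory, suffix=".part")
    written = 0
    try:
        with os.fdopen(fd, "wb") as temp_file:
            for chunk in chunks:
                temp_file.write(chunk)
                written += len(chunk)
        # The rename replaces the destination in a single step.
        os.replace(temp_path, destination)
    except BaseException:
        # Remove the partial file if the stream or the write failed.
        if os.path.exists(temp_path):
            os.remove(temp_path)
        raise
    return written
```

Inside the with block it could be called as save_atomically(response.iter_bytes(chunk_size=16384), 'large_video.mp4'), so that the final file name only ever refers to a complete download.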

Timeouts and Streaming: httpx has no overall timeout for a request, and this matters most when streaming. The timeout setting covers four separate phases (connect, read, write, and acquiring a connection from the pool), and the read timeout applies to each individual read from the network socket, not to the entire download. A timeout=30.0 setting therefore lets the connection stay open indefinitely, as long as some data arrives at least once every 30 seconds. To impose a total time limit on the whole operation, you need to implement your own timing logic.
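One way to implement such timing logic is to wrap the chunk iterator in a generator that checks a monotonic clock. This is a minimal sketch. The names with_deadline and DownloadDeadlineExceeded are hypothetical, and the clock parameter exists only to make the helper easy to test.

```python
import time
from typing import Callable, Iterable, Iterator


class DownloadDeadlineExceeded(Exception):
    """Raised when a stream takes longer than the allowed total time."""


def with_deadline(
    chunks: Iterable[bytes],
    max_seconds: float,
    clock: Callable[[], float] = time.monotonic,
) -> Iterator[bytes]:
    """Yield chunks until the total elapsed time exceeds max_seconds."""
    # The timer starts when iteration begins, not when this is called.
    started = clock()
    for chunk in chunks:
        # Checked after each chunk arrives, before handing it on.
        if clock() - started > max_seconds:
            raise DownloadDeadlineExceeded(
                f"stream exceeded {max_seconds} seconds"
            )
        yield chunk
```

Used as for chunk in with_deadline(response.iter_bytes(), 300.0) inside the with block, the exception propagates out of the block and the connection is closed. The check only runs when a chunk arrives, so a completely stalled connection is still caught by the read timeout rather than by this deadline.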