57.6 httpx: Async-Capable HTTP Client
The httpx library is a modern HTTP client for Python that supports both synchronous and asynchronous operation. It was designed to address limitations in the popular requests library, most notably the lack of native async/await support. Its API will feel familiar to requests users, and its async mode lets I/O-bound applications keep many requests in flight at once instead of waiting on each in turn. It supports HTTP/1.1 and, as an opt-in feature, HTTP/2, along with connection pooling, SSL verification, proxies, cookies, and streaming.
Synchronous vs. Asynchronous Usage
The core strength of httpx is its dual nature. For simple scripts or applications where concurrency is managed by other means (e.g., threading), the synchronous API is perfectly suitable and mirrors the requests API almost exactly. You import and use it directly.
import httpx
# Synchronous request
response = httpx.get('https://httpbin.org/json')
print(response.status_code)
print(response.json())
However, for applications that make many HTTP calls, the asynchronous API is where httpx pays off. It lets you perform many network operations concurrently within a single thread by building on Python’s asyncio library. Instead of blocking the entire program while waiting for a server response, the event loop suspends the waiting coroutine and runs other tasks. To use it, you must be inside a coroutine, and you should use async with to manage the client’s lifecycle.
import httpx
import asyncio
async def main():
    async with httpx.AsyncClient() as client:
        response = await client.get('https://httpbin.org/json')
        print(response.status_code)
        print(response.json())
asyncio.run(main())
The AsyncClient is the gateway to non-blocking requests. The await keyword is crucial; it yields control back to the event loop, allowing other coroutines to run while the network I/O completes in the background. Forgetting to await the call is a common mistake that results in a coroutine object instead of a response.
The Client Object and Connection Pooling
While you can use one-off functions like httpx.get(), creating a Client or AsyncClient instance is the recommended practice for any non-trivial application. The primary reason is connection pooling. The top-level functions open a fresh connection for every call, whereas a client maintains a pool of reusable connections per host. Reusing a connection avoids repeating the TCP (and potentially TLS) handshake for every request, which noticeably reduces latency when you make many requests to the same host.
# Inefficient: New connection for each request
for i in range(10):
    httpx.get('https://example.com/api/items')  # 10 separate connections
# Efficient: Reuses connections from the pool
with httpx.Client() as client:  # Context manager ensures proper cleanup
    for i in range(10):
        client.get('https://example.com/api/items')  # Likely 1 connection
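The context manager (with for Client, async with for AsyncClient) is highly recommended because it ensures connections are closed gracefully when you’re done with the client. If a client has to outlive a single block, call close() (or await aclose() on an AsyncClient) yourself when you are finished. Failing to close the client can lead to resource warnings or leaked connections.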
Handling Timeouts and Exceptions
Robust applications must handle network uncertainty. httpx provides a structured way to manage timeouts through the Timeout object, which lets you set separate limits for the connect, read, write, and pool phases. By default httpx applies a 5-second timeout to each phase, which may be too short or too long for a given service, so it is better to be explicit.
import httpx
# Configure a timeout: 10.0 seconds by default, 5.0 to connect, 30.0 to read.
# httpx.Timeout needs either a default value or all four phases set explicitly.
timeout = httpx.Timeout(10.0, connect=5.0, read=30.0)
try:
    # Apply the timeout configuration
    response = httpx.get('https://httpbin.org/delay/2', timeout=timeout)
    response.raise_for_status()  # Raises an exception for 4xx/5xx responses
    print("Success!")
except httpx.ConnectTimeout:
    print("The request timed out while trying to connect to the server.")
except httpx.ReadTimeout:
    print("The server did not send any data in the allotted time.")
except httpx.HTTPStatusError as e:
    print(f"Server returned an error status code: {e.response.status_code}")
except httpx.RequestError as e:
    print(f"An unexpected error occurred: {e}")
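The raise_for_status() method is a convenient way to turn error responses (status codes 400 and above) into an httpx.HTTPStatusError, forcing you to handle them explicitly rather than letting them pass silently. Recent httpx releases raise for any response that is not a 2xx success, so an unfollowed redirect also triggers it. Note that ConnectTimeout and ReadTimeout are subclasses of RequestError, which is why the more specific handlers must come before the general one.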
Advanced Configuration: Proxies, Authentication, and HTTP/2
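httpx handles complex configurations well. It supports HTTP proxies out of the box and SOCKS proxies (socks5://) when installed with the socks extra, and it offers multiple authentication methods (Basic, Digest). HTTP/2 is opt-in: install the optional h2 dependency with pip install httpx[http2] and create the client with http2=True, after which HTTP/2 is negotiated with servers that support it.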
import httpx
# Configure a client with a proxy, basic auth, and HTTP/2.
# proxy= is the spelling used since httpx 0.26; the older proxies= dict
# was deprecated in that release and removed in 0.28.
proxy = "http://proxy.example.com:8080"
auth = ("my_username", "my_password")
with httpx.Client(proxy=proxy, auth=auth, http2=True) as client:
    response = client.get('https://httpbin.org/basic-auth/my_username/my_password')
    print(response.json())
A common pitfall with proxies is misconfiguring the scheme. When different schemes need different proxies, pass mounts= a dictionary whose keys ("http://", "https://") match the scheme of the target URL, not the scheme of the proxy, with an httpx.HTTPTransport(proxy=...) as each value. The same mounting mechanism accepts custom transports, which gives even lower-level control over network operations and is useful for mocking or testing.
Streaming Requests and Responses
For large file downloads or uploads, or for processing data as it arrives, streaming is essential to prevent loading the entire request or response body into memory at once.
import httpx
# Stream a download
with httpx.Client() as client:
    with client.stream('GET', 'https://httpbin.org/stream-bytes/1024') as response:
        for chunk in response.iter_bytes():
            print(f"Received chunk of {len(chunk)} bytes.")
            # Process the chunk incrementally
# Stream an upload
def generate_file_data():
    yield b"first chunk of data\n"
    yield b"second chunk of data\n"
with httpx.Client() as client:
    # Use content= for raw bytes or byte iterators; data= is meant for form fields
    response = client.post('https://httpbin.org/post', content=generate_file_data())
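The stream method, used as a context manager, is the key for downloads. It sends the request and reads the response headers, but leaves the body unread until you iterate over it. This lets you process data in chunks and stop early if needed, and the context manager releases the connection when the block exits. For uploads, passing a generator as the request body lets httpx send it chunk by chunk instead of building the whole payload in memory first.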