50.2 Capturing stdout and stderr
When executing external commands, capturing their standard output (stdout) and standard error (stderr) is a fundamental requirement for programmatic interaction. The subprocess module provides several powerful and nuanced methods to achieve this, each with distinct use cases and implications.
The subprocess.run() Function and stdout/stderr Arguments
The primary method for capturing output is through the stdout and stderr arguments of subprocess.run(). These arguments accept several constants that define the handling of these streams.
subprocess.PIPE: This is the most common choice for capture. It instructs the run() function to create a pipe between your Python program and the new process. The process’s output is collected into the corresponding attribute (e.g., result.stdout) of the returned CompletedProcess object.
import subprocess
# Capture both stdout and stderr into separate attributes
result = subprocess.run(['ls', '-l', '/nonexistent'],
                        stdout=subprocess.PIPE,
                        stderr=subprocess.PIPE,
                        text=True)  # Decode bytes to string automatically
print(f"Return code: {result.returncode}")
print(f"STDOUT:\n{result.stdout}")  # This will likely be empty for this command
print(f"STDERR:\n{result.stderr}")  # This will contain the 'No such file or directory' error
subprocess.DEVNULL: A special value that tells the subprocess to immediately discard any output sent to that stream. This is useful when you want to suppress output entirely, such as for a noisy command whose output you do not care about.
# Run a command and completely ignore both its output and errors
result = subprocess.run(['curl', '--silent', 'https://example.com'],
                        stdout=subprocess.DEVNULL,
                        stderr=subprocess.DEVNULL)
print("Command executed, output discarded.")
subprocess.STDOUT: A special value used only for the stderr argument. It tells the subprocess to redirect its stderr stream to its stdout stream. This allows you to capture both standard output and error messages intermixed in a single stream, result.stdout.
# Redirect stderr to stdout, capturing both in a single stream
result = subprocess.run(['ls', '-l', '/nonexistent', '/tmp'],
                        stdout=subprocess.PIPE,
                        stderr=subprocess.STDOUT,  # Critical: redirects stderr to stdout
                        text=True)
print(f"Combined output:\n{result.stdout}")
# The output will contain both the listing for /tmp and the error for /nonexistent
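As a rough sketch of how separate capture looks in practice, the helper below wraps run() and returns the three pieces of the result. The name run_and_report and the child script are hypothetical, chosen only for illustration. The sketch relies on capture_output=True, which since Python 3.7 is shorthand for passing stdout=subprocess.PIPE and stderr=subprocess.PIPE, and it launches a child Python interpreter through sys.executable so that it does not depend on ls or curl being installed.

```python
import subprocess
import sys

def run_and_report(args):
    """Run a command, capturing stdout and stderr separately as text."""
    # capture_output=True is equivalent to stdout=PIPE, stderr=PIPE (Python 3.7+).
    result = subprocess.run(args, capture_output=True, text=True)
    return result.returncode, result.stdout, result.stderr

# Hypothetical child: writes one line to each stream, then exits with status 3.
child_code = (
    "import sys; "
    "print('to stdout'); "
    "print('to stderr', file=sys.stderr); "
    "sys.exit(3)"
)

code, out, err = run_and_report([sys.executable, "-c", child_code])
print(f"Return code: {code}")
print(f"STDOUT: {out!r}")
print(f"STDERR: {err!r}")
```

Keeping the streams separate like this makes it easy to log stderr while parsing stdout, at the cost of losing the relative ordering of the two streams.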
Understanding the text Argument and Encoding
A crucial and often confusing aspect is the difference between bytes and strings. By default, PIPE captures output as raw bytes (bytes objects). This is the safest default because it preserves the exact output of the command, regardless of its encoding. The text=True argument (added in Python 3.7 as a more readable alias for universal_newlines=True) instructs subprocess to decode those bytes into strings using the locale's preferred encoding, which can be overridden with the encoding parameter.
Why this matters: If you mix a bytes object with strings (e.g., result.stdout.split('\n') or 'error' in result.stdout) without setting text=True, you will get a TypeError. Methods such as result.stdout.splitlines() do work on bytes, but they return bytes objects rather than strings, which tends to cause the same error later in the program. Always decide consciously: use text=True for text-based processing or work directly with bytes for binary data.
# Default behavior: capture as bytes
result_bytes = subprocess.run(['echo', 'hello'], stdout=subprocess.PIPE)
print(type(result_bytes.stdout)) # <class 'bytes'>
print(result_bytes.stdout) # b'hello\n'
# Using text=True: capture as string
result_str = subprocess.run(['echo', 'hello'], stdout=subprocess.PIPE, text=True)
print(type(result_str.stdout)) # <class 'str'>
print(repr(result_str.stdout)) # 'hello\n'
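When the command's output encoding is known, it may be safer to state it explicitly than to rely on the locale. The sketch below assumes a hypothetical child process that emits UTF-8 bytes, and compares decoding by hand with passing encoding to run(). Passing encoding (or errors) switches the pipes to text mode on its own, so text=True is not needed alongside it.

```python
import subprocess
import sys

# Hypothetical child: writes the UTF-8 bytes for "caf\u00e9" plus a newline to stdout.
child_code = r"import sys; sys.stdout.buffer.write(b'caf\xc3\xa9\n')"
args = [sys.executable, "-c", child_code]

# Bytes mode: the raw bytes come back untouched.
raw = subprocess.run(args, stdout=subprocess.PIPE)
print(raw.stdout)  # b'caf\xc3\xa9\n'

# Decoding by hand gives full control over the codec and error handling.
decoded = raw.stdout.decode("utf-8")

# Text mode with an explicit encoding; errors="replace" substitutes U+FFFD
# for undecodable bytes instead of raising UnicodeDecodeError.
as_text = subprocess.run(args, stdout=subprocess.PIPE,
                         encoding="utf-8", errors="replace")
print(type(as_text.stdout))       # <class 'str'>
print(as_text.stdout == decoded)  # True
```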
The Risk of Deadlocks and How to Avoid Them
A significant pitfall when using PIPE is the potential for a deadlock. This occurs when the parent process (your Python script) and the child process wait indefinitely for each other. The most common scenario is when both stdout=PIPE and stderr=PIPE are used, and the child process writes a large amount of data to stderr. The OS pipe buffers fill up, and the child process blocks, waiting for the parent to read from the stderr pipe. Meanwhile, the parent is waiting for the child to finish and close its stdout pipe before it reads the stderr data. Both processes are stuck waiting.
Solution: The subprocess.run() function with PIPE internally manages the reading of both pipes and waits for the process to terminate, effectively avoiding this deadlock for you. This is a major advantage over the older Popen interface.
However, if you are using the lower-level Popen object directly for more complex interactions, you must manage the pipes yourself to prevent deadlocks. This often involves using threads or the select module to read from stdout and stderr as data becomes available.
# This deadlock risk is handled automatically by run(), but is a real danger with Popen.
from subprocess import Popen, PIPE
proc = Popen(['command', 'that', 'outputs', 'a', 'lot'], stdout=PIPE, stderr=PIPE)
# UNSAFE with large output: calling proc.wait() and only then proc.stdout.read()
# can deadlock, because the child blocks on a full pipe while the parent waits for it to exit.
# SAFE: communicate() reads both pipes to the end, then waits for the process.
stdout, stderr = proc.communicate()
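When output has to be handled as it arrives rather than after the process exits, communicate() is not enough, and the thread-based approach mentioned above becomes relevant. The following is a minimal sketch of that idea, not a definitive implementation: the drain helper and the child script are hypothetical, and the child is simply a Python one-off that writes more data to each stream than a typical pipe buffer holds.

```python
import subprocess
import sys
import threading

def drain(stream, sink):
    """Read a pipe line by line until EOF, appending each line to sink."""
    for line in stream:
        sink.append(line)
    stream.close()

# Hypothetical child: interleaves 20,000 lines on stdout and on stderr.
child_code = (
    "import sys\n"
    "for i in range(20000):\n"
    "    print('out', i)\n"
    "    print('err', i, file=sys.stderr)\n"
)

proc = subprocess.Popen([sys.executable, "-c", child_code],
                        stdout=subprocess.PIPE, stderr=subprocess.PIPE,
                        text=True)

out_lines, err_lines = [], []
threads = [
    threading.Thread(target=drain, args=(proc.stdout, out_lines)),
    threading.Thread(target=drain, args=(proc.stderr, err_lines)),
]
# One reader per pipe, so neither pipe can fill up while the other is being read.
for t in threads:
    t.start()
for t in threads:
    t.join()

returncode = proc.wait()  # Safe now: both pipes have already reached EOF
print(returncode, len(out_lines), len(err_lines))
```

In a real program the body of drain would process or log each line instead of accumulating it, which also keeps memory use bounded.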
Best Practices and Common Pitfalls
Check the Return Code: Always check result.returncode after running a command. A zero typically indicates success, while a non-zero value indicates an error. Relying solely on the presence of output in stderr is unreliable, as some successful commands may write warnings to stderr.
Beware of Large Output: Using PIPE captures the entire output of the command in memory. For commands that can produce gigabytes of output, this can exhaust your system’s memory. In such cases, consider redirecting the output directly to a file using the stdout and stderr arguments.
# Redirect output directly to files to avoid memory issues with large data.
# The child writes straight to the files, so text=True is not needed here.
with open('stdout.log', 'w') as out_file, open('stderr.log', 'w') as err_file:
    result = subprocess.run(['dd', 'if=/dev/zero', 'bs=1M', 'count=1000'],
                            stdout=out_file, stderr=err_file)
Use check=True for Automatic Failure: If a non-zero return code should be treated as an exception, use check=True. This will cause a CalledProcessError to be raised if the command fails, which often simplifies error handling logic.
try:
    result = subprocess.run(['false'], check=True, stdout=subprocess.PIPE, text=True)
except subprocess.CalledProcessError as e:
    print(f"Command failed with return code {e.returncode}")
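Combining check=True with capture does not throw the captured output away: the raised CalledProcessError carries it in its stdout and stderr attributes. The sketch below illustrates this with a hypothetical child script that prints a line and then fails; the variable names are illustrative only.

```python
import subprocess
import sys

# Hypothetical child: prints to stdout, then exits with status 1
# and a message on stderr (sys.exit with a string does both).
child_code = (
    "import sys; "
    "print('partial result'); "
    "sys.exit('something went wrong')"
)

failure = None
try:
    subprocess.run([sys.executable, "-c", child_code],
                   check=True,
                   stdout=subprocess.PIPE,
                   stderr=subprocess.PIPE,
                   text=True)
except subprocess.CalledProcessError as exc:
    failure = exc  # Keep a reference; the name exc is cleared after the except block
    print(f"Command failed with return code {exc.returncode}")
    print(f"Captured stdout: {exc.stdout!r}")
    print(f"Captured stderr: {exc.stderr!r}")
```

Logging exc.stderr at the point of failure is usually what makes the resulting error message actionable.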