50.1 subprocess.run(): The Modern API

The subprocess.run() function, introduced in Python 3.5, is the recommended high-level API for spawning subprocesses. It covers the use cases of the older call(), check_call(), and check_output() helpers in a single interface, and it is built on Popen, which remains available when finer control is needed. Its primary advantage is that one call handles the entire lifecycle of the process (starting it, waiting for completion, and collecting output), which reduces boilerplate and the room for error.

Basic Usage and Return Value

At its simplest, subprocess.run() executes the provided command and returns a CompletedProcess instance. This object contains vital information about the finished process, including the return code (.returncode), any captured standard output (.stdout), and standard error (.stderr).

import subprocess

result = subprocess.run(['ls', '-l'])
print(f"Return code: {result.returncode}")

This runs the ls -l command. By default, the subprocess’s output is sent directly to the terminal, not captured by the Python script. The CompletedProcess object will have stdout and stderr set to None.
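As a minimal sketch of the uncaptured case, the following inspects the fields of the returned CompletedProcess. It assumes a Python child process (started through sys.executable) as a portable stand-in for ls -l; the child command is illustrative and not part of the original example.

```python
import subprocess
import sys

# Assumption: a Python child stands in for `ls -l` so the sketch runs on any platform.
completed = subprocess.run([sys.executable, "-c", "print('hello')"])

print(completed.args)         # the command list that was passed to run()
print(completed.returncode)   # exit status of the child
print(completed.stdout)       # None: the output went to the terminal, not to Python
print(completed.stderr)       # None for the same reason
```

The child's own output appears on the terminal before these lines, because nothing redirected it.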

Capturing Output

To capture the output of the command for use within your Python script, pass capture_output=True (available since Python 3.7). This is shorthand for stdout=subprocess.PIPE and stderr=subprocess.PIPE: both streams are redirected to pipes so that run() can collect the data. On older versions, or to capture only one stream, set stdout and stderr explicitly. Captured output is returned as bytes by default.

result = subprocess.run(['ls', '-l'], capture_output=True)
print(f"STDOUT (as bytes):\n{result.stdout}")
print(f"STDERR (as bytes):\n{result.stderr}")

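To have the output decoded into strings automatically, use text=True (Python 3.7+; the older spelling is universal_newlines=True). Decoding uses the locale's preferred encoding, which is often UTF-8 but varies by platform and configuration, so pass encoding explicitly (for example encoding="utf-8") when the child's output encoding is known. This is recommended for most text-processing use cases because it avoids manual decoding steps.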

result = subprocess.run(['ls', '-l'], capture_output=True, text=True)
print(f"STDOUT (as string):\n{result.stdout}")
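The difference between the bytes and text modes can be sketched side by side. The example assumes a small Python child that writes one line to each stream; that child is illustrative only. It also shows the explicit stdout/stderr form that capture_output=True abbreviates.

```python
import subprocess
import sys

# Assumption: this child writes "out" to stdout and "err" to stderr.
child = [sys.executable, "-c",
         "import sys; print('out'); print('err', file=sys.stderr)"]

# Default mode: captured streams are bytes.
raw = subprocess.run(child, capture_output=True)

# Explicit pipes plus text=True: captured streams are str.
decoded = subprocess.run(child, stdout=subprocess.PIPE,
                         stderr=subprocess.PIPE, text=True)

print(type(raw.stdout).__name__)       # bytes
print(type(decoded.stdout).__name__)   # str
print(decoded.stdout.strip(), decoded.stderr.strip())
```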

Handling Errors and Checking Return Codes

A non-zero exit code typically indicates that the command failed. The run() function can automatically check this and raise a CalledProcessError exception if the command fails, which often simplifies error handling. This is controlled with the check=True parameter.

try:
    # This will fail because '/nonexistent/dir' likely doesn't exist
    result = subprocess.run(['ls', '/nonexistent/dir'], capture_output=True, text=True, check=True)
except subprocess.CalledProcessError as e:
    print(f"Command failed with return code {e.returncode}.")
    print(f"Error output: {e.stderr}")

Without check=True, it is the programmer’s responsibility to inspect result.returncode after the call. The check=True behavior mirrors the old subprocess.check_call().
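A sketch of the manual route: inspect returncode directly, or call the check_returncode() method of CompletedProcess, which raises CalledProcessError after the fact. The failing child (a Python process that exits with status 3) is an assumption made for portability.

```python
import subprocess
import sys

# Assumption: a Python child that exits with status 3 stands in for a failing command.
failing = [sys.executable, "-c", "import sys; sys.exit(3)"]

result = subprocess.run(failing)   # no check=True, so no exception is raised here
if result.returncode != 0:
    print(f"Command exited with status {result.returncode}")

# The same check can be deferred and triggered explicitly.
raised_code = None
try:
    result.check_returncode()
except subprocess.CalledProcessError as exc:
    raised_code = exc.returncode
print(raised_code)
```

Deferring the check this way can be useful when the output should be logged before deciding whether a non-zero status is fatal.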

Input to the Subprocess (Feeding stdin)

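To send data to the standard input of the subprocess, use the input parameter. run() then creates the stdin pipe itself, so you do not need capture_output=True to feed input, and you must not pass the stdin argument as well (combining stdin with input raises ValueError). capture_output=True is needed only if you also want to read the result back. As with output, if text=True is set, the input must be a string; otherwise, it must be bytes.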

# Using a string with text=True
input_data = "Alice\nBob\nCharlie"
result = subprocess.run(['grep', 'li'], input=input_data, capture_output=True, text=True)
print(result.stdout)  # Prints "Alice" and "Charlie" on separate lines; both contain "li"

# Using bytes without text=True
input_data_bytes = b"Alice\nBob\nCharlie"
result = subprocess.run(['grep', 'li'], input=input_data_bytes, capture_output=True)  # args stay str; only input is bytes
print(result.stdout.decode())  # Manual decoding required
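The following sketch shows that input alone is enough to feed stdin, and that combining it with an explicit stdin argument is rejected. It assumes a Python child that upper-cases whatever it reads, used here in place of grep so the example does not depend on a particular platform.

```python
import subprocess
import sys

# Assumption: this child copies stdin to stdout in upper case.
upper = [sys.executable, "-c",
         "import sys; sys.stdout.write(sys.stdin.read().upper())"]

# Only stdout is captured; run() sets up the stdin pipe on its own.
result = subprocess.run(upper, input="Alice\nBob\n",
                        stdout=subprocess.PIPE, text=True)
print(result.stdout)

# Passing stdin together with input is an error.
conflict = None
try:
    subprocess.run(upper, input="x", stdin=subprocess.PIPE, text=True)
except ValueError as exc:
    conflict = str(exc)
print(conflict)
```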

Timeout and Process Control

The timeout parameter is a critical safety feature. It specifies the maximum number of seconds (int or float) the subprocess may take to complete. If the process exceeds this time, run() kills the child, waits for it to exit, and then raises TimeoutExpired, which prevents your application from hanging indefinitely.

try:
    # This command will sleep for 10 seconds
    result = subprocess.run(['sleep', '10'], timeout=2.5)
except subprocess.TimeoutExpired:
    print("The subprocess was terminated for taking too long.")

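Note that run() does not attempt a graceful shutdown on timeout. It calls Popen.kill(), which sends SIGKILL on POSIX and calls TerminateProcess() on Windows, and then waits for the child to exit before re-raising TimeoutExpired. Only the direct child is killed: with shell=True, for example, processes started by the shell may keep running after the timeout.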

Security Consideration: shell=True

A common pitfall is using shell=True with untrusted input, as it introduces severe security risks. When shell=True is used, the first argument is passed as a single string to the system’s shell (e.g., /bin/sh on Linux), which parses it and executes any shell commands within it.

# UNSAFE with user input
user_filename = "myfile.txt; rm -rf /"
subprocess.run(f"cat {user_filename}", shell=True)  # This is dangerously vulnerable!

# SAFER alternative without shell=True
user_filename = "myfile.txt; rm -rf /"
subprocess.run(['cat', user_filename])  # This will try to cat a file with a semicolon in its name.

The safe approach is to avoid shell=True whenever possible. If you must use it, never construct the command string from unvalidated user input. The only safe case for shell=True is with a fixed, hard-coded string. For most use cases, passing a list of arguments is the correct and secure method, as it avoids the shell’s command parsing entirely.
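The claim that a list of arguments bypasses shell parsing can be checked with a small sketch. It assumes a Python child that prints its first argument, used in place of cat so that no file needs to exist. The final lines show shlex.quote(), which the standard library provides for escaping a single argument for POSIX shells; it is mentioned only as a fallback for cases where a shell is unavoidable, not as a replacement for the list form.

```python
import shlex
import subprocess
import sys

hostile = "myfile.txt; echo INJECTED"

# Assumption: this child echoes its first argument back, standing in for `cat`.
echo_arg = [sys.executable, "-c", "import sys; print(sys.argv[1])", hostile]

result = subprocess.run(echo_arg, capture_output=True, text=True, check=True)
# The semicolon is ordinary data: the whole string arrives as one argument.
print(result.stdout.strip())

# For POSIX shells only: quote a single argument before building a shell string.
quoted = shlex.quote(hostile)
print(quoted)
```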