The shutil module, short for “shell utilities,” is an indispensable part of Python’s standard library for high-level file operations. While pathlib and os excel at path manipulation and low-level system calls, shutil provides a suite of functions designed for everyday tasks like copying files, moving directories, and recursively deleting entire folder structures. It abstracts away the complexities and platform-specific nuances of these operations, offering a clean, Pythonic interface.

Copying Files and Metadata

The most basic operation is copying a single file. shutil.copy2(src, dst) is the preferred function for this task. It copies the file’s content and its metadata, including timestamps (creation and modification) and permissions. This is in contrast to shutil.copy(src, dst), which only copies the data and the file’s permission mode. The dst can be a target directory or a full path to a new filename.

import shutil
from pathlib import Path

source_file = Path('data.txt')
dest_dir = Path('/backups/')

# Copy to a directory (keeps original filename 'data.txt')
shutil.copy2(source_file, dest_dir)

# Copy and rename the file in the destination
shutil.copy2(source_file, dest_dir / 'data_backup.txt')

# Demonstrating the difference between copy and copy2
import os
shutil.copy(source_file, 'copy.txt')  # Copies data and permissions
shutil.copy2(source_file, 'copy2.txt') # Copies data, permissions, *and* timestamps

print("Original modified time:", os.path.getmtime('data.txt'))
print("copy() modified time:  ", os.path.getmtime('copy.txt'))  # Time of copy operation
print("copy2() modified time: ", os.path.getmtime('copy2.txt')) # Matches original

Copying Directories Recursively

To duplicate an entire directory tree, you use shutil.copytree(src, dst). This function creates a new destination directory and recursively copies every file and subdirectory from the source into it. A crucial feature is its dirs_exist_ok parameter (introduced in Python 3.8). By default, copytree requires the destination to not exist, preventing accidental overwrites. Setting dirs_exist_ok=True allows the function to merge the contents of the source into an existing destination directory.

import shutil

# This will fail if '/backups/my_project' already exists
try:
    shutil.copytree('./my_project', '/backups/my_project')
except FileExistsError as e:
    print(f"Error: {e}")

# The safe, modern way: allow merging with an existing directory
shutil.copytree('./my_project', '/backups/my_project', dirs_exist_ok=True)

# A useful pattern is to combine it with pathlib for path resolution
source_dir = Path('./reports/q3')
dest_dir = Path('/archive/2023/reports')
dest_dir.mkdir(parents=True, exist_ok=True)  # Ensure the parent exists
shutil.copytree(source_dir, dest_dir / 'q3', dirs_exist_ok=True)

Moving Files and Directories

The shutil.move(src, dst) function is used to relocate files and directories. Its behavior is analogous to the Unix mv command. If the destination is a directory, the source is moved into it. If the destination is a file path or doesn’t exist, the source is effectively renamed. Crucially, if the destination is on the same filesystem, move will use an atomic os.rename() operation, which is very fast. If the destination is on a different filesystem, shutil must fall back to a slower copy-and-delete process.

# Move a file into a directory
shutil.move('report.pdf', '/archive/2023/')

# Rename a file (moving it to a new name in the same directory)
shutil.move('old_name.txt', 'new_name.txt')

# Move and rename a file in one operation
shutil.move('document.txt', '/archive/2023/renamed_doc.txt')

Deleting Entire Directory Trees

The most dangerous and powerful function in shutil is shutil.rmtree(path). It deletes a directory and all of its contents, recursively. There is no built-in trash/recycle bin functionality; this operation is permanent and immediate. This is why it is one of the most common sources of catastrophic bugs. Always double and triple-check the path you are passing to this function. A best practice is to use pathlib.Path objects to construct paths safely and to print the path for visual confirmation before execution, especially in scripts.

import shutil
from pathlib import Path

# DANGER: Permanently deletes '/tmp/build_cache' and everything inside it.
cache_dir = Path('/tmp/build_cache')

# BEST PRACTICE: Add a safety check before calling rmtree.
if cache_dir.exists() and cache_dir.is_dir():
    print(f"About to permanently delete: {cache_dir.resolve()}")
    confirm = input("Type 'YES' to confirm: ")
    if confirm == 'YES':
        shutil.rmtree(cache_dir)
        print("Directory deleted.")
    else:
        print("Deletion cancelled.")
else:
    print("Directory does not exist.")

Common Pitfalls and Best Practices

  1. Race Conditions: Be aware that between checking for a path’s existence (with .exists()) and performing an operation on it, another process could delete or create it. Handle exceptions (FileNotFoundError, PermissionError) instead of relying solely on checks.
  2. Symbolic Links: By default, copytree will follow symbolic links and copy the files they point to. This can lead to unintended data duplication or copying massive directory structures. Use the symlinks=True parameter to preserve the links instead.
  3. Metadata and Permissions: If preserving all metadata is critical, use copy2 and copytree(copy_function=shutil.copy2). Be mindful that copying files to a different filesystem might not preserve all metadata (like creation time on some systems) due to OS limitations.
  4. Error Handling: File operations can fail for many reasons: missing files, permission issues, or full disks. Always wrap shutil operations in try...except blocks to handle these scenarios gracefully rather than letting your script crash.
  5. The Nuclear Option (rmtree): As a final safety measure, consider using a dedicated library like send2trash (which requires installation) for operations that should be reversible, especially in user-facing applications.