56.7 ftplib: FTP File Transfer
The ftplib module in Python provides a client-side implementation of the File Transfer Protocol (FTP), enabling scripts to interact with FTP servers for a wide range of operations, from anonymous public downloads to automated uploads. It abstracts the complexities of the FTP command and response protocol, offering both a high-level interface for common tasks and a low-level interface (such as sendcmd()) for fine-grained control. Understanding its operation requires a grasp of FTP’s two-channel nature: a control connection (port 21 by default) for commands and a separate data connection established on demand for transferring file listings and contents.
Establishing a Connection and Basic Commands
The core of ftplib is the FTP class. Passing a hostname to the constructor opens the control connection immediately; without one, you call connect() yourself later. The login() method authenticates the session and falls back to anonymous credentials if none are provided. It’s crucial to understand that this initial connection carries only commands and replies; the data connection is negotiated separately for each transfer, a process handled automatically by the library but influenced by the connection mode.
from ftplib import FTP, all_errors
import sys
try:
    # Connect to a host (the constructor opens the control connection)
    with FTP('ftp.example.com') as ftp:
        # Login with username and password
        ftp.login('username', 'password123')
        print("Login successful.")
        # Get the welcome message from the server
        print(f"Server welcome: {ftp.getwelcome()}")
        # Print the current working directory
        print(f"Current directory: {ftp.pwd()}")
        # Change to a specific directory
        ftp.cwd('/pub/incoming')
        print(f"New directory: {ftp.pwd()}")
except all_errors as e:
    # all_errors covers ftplib's own exceptions plus OSError and EOFError
    print(f"FTP error: {e}")
    sys.exit(1)
Active vs. Passive Mode
This is a critical concept and a common source of connection failures, especially with modern firewalls and NATs. In Active mode, the client opens a listening socket and tells the server its IP and port (the PORT command). The server then connects from its port 20 to the client’s specified port. This often fails because a client-side firewall or NAT blocks the incoming server connection.
In Passive mode (the default and recommended setting in ftplib), the client asks the server to open a data port (the PASV command) and report its IP and port. The client then connects to that server port. This usually works because it only involves outbound connections from the client. You can set the mode explicitly with ftp.set_pasv(True), or switch to Active mode with ftp.set_pasv(False).
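To make the passive-mode handshake concrete, here is a minimal sketch of how a client can decode the server's reply to PASV. ftplib does this internally, so the helper name parse_pasv_reply is purely illustrative and not part of the library; the sample reply uses a documentation-range address.

```python
import re


def parse_pasv_reply(reply: str) -> tuple[str, int]:
    """Extract (host, port) from a '227 ... (h1,h2,h3,h4,p1,p2)' reply."""
    if not reply.startswith("227"):
        raise ValueError(f"not a PASV success reply: {reply!r}")
    match = re.search(r"(\d+),(\d+),(\d+),(\d+),(\d+),(\d+)", reply)
    if match is None:
        raise ValueError(f"no address found in reply: {reply!r}")
    numbers = [int(group) for group in match.groups()]
    # The first four numbers are the IPv4 octets.
    host = ".".join(str(n) for n in numbers[:4])
    # The last two numbers are the high and low bytes of the port.
    port = numbers[4] * 256 + numbers[5]
    return host, port


print(parse_pasv_reply("227 Entering Passive Mode (192,0,2,10,195,80)"))
# prints ('192.0.2.10', 50000)
```

The client would then open an outbound TCP connection to that host and port for the actual transfer, which is why Passive mode tends to pass through client-side firewalls.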
Listing Directory Contents and Retrieving Files
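The nlst() and dir() methods are used to obtain directory listings. nlst() returns a simple list of names, while dir() sends the LIST command and produces a detailed listing (like ls -l). By default dir() prints that listing to standard output rather than returning it; pass a callback to capture the lines, which you must then parse yourself. For file transfers, retrbinary() is used for non-text files (images, archives) and retrlines() for text files. retrlines() transfers in text mode and hands your callback one line at a time with the line ending stripped, so retrbinary() is the safer choice whenever the bytes must be preserved exactly.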
from ftplib import FTP
with FTP('ftp.example.com') as ftp:
    ftp.login('user', 'pass')
    ftp.cwd('/pub/mirrors')
    # Get a simple list of files and directories
    file_list = ftp.nlst()
    print("Files in directory:", file_list)
    # Download a binary file (e.g., a ZIP archive)
    # Assumes the first entry is a regular file, not a directory
    with open('local_copy.zip', 'wb') as local_file:
        # The 'RETR filename' command is sent, and data is written to local_file
        ftp.retrbinary(f'RETR {file_list[0]}', local_file.write)
    # Download and print a text file line by line
    ftp.retrlines('RETR README.txt', print)
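Since the detailed listing from dir() has to be parsed by hand, the following is a rough sketch of one way to do it. It assumes the server emits Unix-style ls -l lines, which is common but not guaranteed; the ListEntry class and parse_unix_list_line function are hypothetical names introduced here, and symbolic links ("name -> target") are not handled.

```python
from dataclasses import dataclass


@dataclass
class ListEntry:
    name: str
    size: int
    is_dir: bool


def parse_unix_list_line(line: str) -> ListEntry:
    """Parse one Unix-style LIST line into a ListEntry."""
    # Split into at most 9 fields so that names containing spaces stay intact.
    parts = line.split(None, 8)
    if len(parts) < 9:
        raise ValueError(f"unrecognized LIST line: {line!r}")
    perms, _links, _owner, _group, size, _month, _day, _time_or_year, name = parts
    return ListEntry(name=name, size=int(size), is_dir=perms.startswith("d"))


sample = "-rw-r--r--   1 ftp      ftp       1048576 Jan 05 12:34 archive 2024.zip"
print(parse_unix_list_line(sample))
```

In practice the lines could be collected with lines = []; ftp.dir(lines.append) and then mapped through the parser. Where the server supports the MLSD command, ftplib's mlsd() method returns structured facts and avoids this kind of format guessing.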
Uploading Files and Other Operations
The counterparts for upload are storbinary() for binary files and storlines() for text files. Both expect a file object opened in binary mode. As with retrieval, storbinary() is the preferred method because it sends the bytes unchanged. Other common operations include creating directories (mkd()), deleting files (delete()), and renaming files (rename()).
from ftplib import FTP
with FTP('ftp.example.com') as ftp:
    ftp.login('user', 'pass')
    ftp.cwd('/incoming')
    # Create a new directory on the server
    ftp.mkd('uploaded_data')
    # Upload a binary file
    with open('report.pdf', 'rb') as local_file:
        ftp.storbinary('STOR report_backup.pdf', local_file)
    # Upload a text file using storlines (less common);
    # the file is still opened in binary mode
    with open('notes.txt', 'rb') as local_file:
        ftp.storlines('STOR notes.txt', local_file)
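The upload, rename, and delete operations above can be combined so that a partially transferred file is never visible under its final name. The sketch below is one possible arrangement, not a pattern prescribed by ftplib: the helper name upload_then_rename and the ".part" suffix are assumptions, and it presumes the server allows renaming within the target directory.

```python
from ftplib import all_errors


def upload_then_rename(ftp, fileobj, final_name, temp_suffix=".part"):
    """Upload under a temporary name, then rename to final_name on success."""
    temp_name = final_name + temp_suffix
    try:
        ftp.storbinary(f"STOR {temp_name}", fileobj)
        ftp.rename(temp_name, final_name)
    except all_errors:
        # Best-effort cleanup of the partial upload, then re-raise the
        # original error so the caller still sees what went wrong.
        try:
            ftp.delete(temp_name)
        except all_errors:
            pass
        raise
    return final_name
```

A call might look like upload_then_rename(ftp, open('report.pdf', 'rb'), 'report_backup.pdf') inside an existing session; the ftp argument only needs the storbinary(), rename(), and delete() methods, which also makes the helper easy to exercise with a stand-in object.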
Error Handling and Best Practices
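FTP is a legacy protocol and can be brittle. Robust code must anticipate and handle exceptions. ftplib raises error_perm for permanent errors (5xx replies, e.g., permission denied, file not found) and error_temp for temporary errors (4xx replies, e.g., the server is too busy to accept the request). Network-level failures such as a connection timeout are not FTP replies; they surface as OSError subclasses instead. The ftplib.all_errors tuple groups the library's own exceptions with OSError and EOFError for a single catch-all clause.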
Always use a context manager (with statement) or a try/finally block to ensure the connection is properly closed, even if an error occurs. This is not just about cleanliness; it signals the server to terminate the connection gracefully.
from ftplib import FTP, error_perm
try:
    with FTP('ftp.example.com', timeout=30) as ftp:
        ftp.login()
        ftp.cwd('/non/existent/dir')  # This will likely cause an error
except error_perm as e:
    print(f"Permanent FTP Error: {e}")
except Exception as e:
    print(f"Other error: {e}")
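Because error_temp marks a condition the server considers transient, retrying is a reasonable response, whereas error_perm is not worth retrying. The helper below is a minimal sketch of that idea; retry_on_temp_error, its parameters, and the linear back-off are illustrative choices rather than anything ftplib provides.

```python
import time
from ftplib import error_temp


def retry_on_temp_error(operation, attempts=3, delay=1.0):
    """Call operation(); retry only when the server reports a temporary error."""
    for attempt in range(1, attempts + 1):
        try:
            return operation()
        except error_temp:
            if attempt == attempts:
                raise  # out of attempts: let the caller see the error
            # Wait a little longer after each failed attempt.
            time.sleep(delay * attempt)
```

Inside a session this could wrap a single command, for example retry_on_temp_error(lambda: ftp.cwd('/pub/incoming')). Permanent errors and network failures pass straight through, so the surrounding try/except still applies.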
Security Limitations and Modern Alternatives
A major pitfall of standard FTP is that it transmits all data, including passwords, in plaintext. This is a significant security risk. For any sensitive transfer, FTPS (FTP over SSL/TLS) should be used. ftplib supports this via the FTP_TLS class. Its login() method secures the control connection before sending credentials, but the data channel stays unencrypted until you explicitly protect it with prot_p().
from ftplib import FTP_TLS
with FTP_TLS('ftp.secure.example.com') as ftps:
    ftps.login('user', 'pass')  # Control connection is secured before credentials are sent
    ftps.prot_p()  # Secures the data connection with SSL/TLS
    # ... proceed with secure transfers ...
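Encryption alone does not confirm that you are talking to the intended server. One way to make the verification policy explicit, rather than relying on whatever context the library builds by default, is to pass FTP_TLS an ssl context of your own. The function name make_verified_ftps below is hypothetical; the sketch only constructs the client and does not connect.

```python
import ssl
from ftplib import FTP_TLS


def make_verified_ftps(timeout=30):
    """Return an unconnected FTP_TLS client that verifies server certificates."""
    # create_default_context() requires a valid certificate chain and
    # checks that the certificate matches the host name.
    context = ssl.create_default_context()
    return FTP_TLS(context=context, timeout=timeout)


ftps = make_verified_ftps()
print(ftps.context.verify_mode == ssl.CERT_REQUIRED)
# Later, against a real server:
#   ftps.connect('ftp.secure.example.com')
#   ftps.login('user', 'pass')
#   ftps.prot_p()
```

With this context a server presenting a self-signed or mismatched certificate should cause the TLS handshake to fail instead of being silently accepted.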
It is also considered a best practice to avoid ftplib for new projects where possible, opting for more modern and secure protocols such as SFTP (for example, via the third-party paramiko library) or transfers over HTTPS. FTP’s design, including its separate data channel and lack of inherent encryption, makes it difficult to secure and operate through modern network infrastructure.