42.7 importlib: Dynamic Imports and Custom Importers
The importlib module, introduced in Python 3.1, has supplied the interpreter's own import machinery since Python 3.3 and supersedes the older imp module (deprecated in Python 3.4 and removed in Python 3.12). It provides a rich API to interact with the import system, exposing the hooks and protocols that Python itself uses. This allows developers to perform dynamic imports, introspect and manipulate the import cache, and create custom importers that load modules from non-standard locations such as databases, networks, or compressed archives.
The importlib API for Dynamic Importing
While the built-in __import__() function can perform dynamic imports, its behavior is easy to misuse: for a dotted name such as 'urllib.request' it returns the top-level package (urllib) rather than the submodule, unless a non-empty fromlist is passed. The importlib module provides a cleaner API. Its import_module() function takes the module name as a string and returns the named module object itself, performing the same work as the import statement.
import importlib
# Equivalent to: import json
json_module = importlib.import_module('json')
data = json_module.loads('{"key": "value"}')
print(data) # Output: {'key': 'value'}
# Equivalent to: from urllib import request
request_module = importlib.import_module('urllib.request')
response = request_module.urlopen('http://www.python.org')
This is particularly useful when the module name is not known until runtime, such as when it is constructed from user input or configuration files. By default, import_module() expects an absolute, fully qualified name. To pass a relative name such as '.submodule', you must also supply the package argument, which acts as the anchor for resolving the relative name; omitting it raises a TypeError. Code running inside a package can usually pass its own __package__ value.
# Inside a package named 'mypackage'
# To relatively import .submodule
submodule = importlib.import_module('.submodule', package='mypackage')
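When the module name comes from configuration, a small helper can turn a string into the object it names. The following is a minimal sketch, not an importlib feature: the helper name load_object and the 'module.path:attribute' string format are assumptions made for this example.

```python
import importlib


def load_object(path):
    """Resolve a 'module.path:attribute' string to the object it names.

    Hypothetical helper; the 'module:attribute' format is an assumption
    of this sketch, not an importlib convention.
    """
    module_name, sep, attr_name = path.partition(":")
    if not sep or not module_name or not attr_name:
        raise ValueError(f"expected 'module:attribute', got {path!r}")
    module = importlib.import_module(module_name)  # may raise ImportError
    try:
        return getattr(module, attr_name)
    except AttributeError:
        raise ImportError(
            f"module {module_name!r} has no attribute {attr_name!r}"
        ) from None


# The string could come from a configuration file or a command-line option.
dumps = load_object("json:dumps")
print(dumps({"key": "value"}))  # Output: {"key": "value"}
```

Importing a module executes its top-level code, so names taken from untrusted input should be checked against an allow-list before they reach import_module().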
Reloading Modules with importlib.reload()
The importlib.reload() function is used to reload a previously imported module. This is especially valuable in long-running interactive sessions or development servers where you need to test code changes without restarting the entire interpreter. It re-executes the module’s code, redefining its top-level names, and updates the existing module object in-place.
import mymodule
# ... later, after modifying mymodule.py ...
import importlib
mymodule = importlib.reload(mymodule)
A critical pitfall to understand is that reload() only affects the specific module passed to it. It does not recursively reload any modules that the target module itself imported. Furthermore, existing objects created from the old module’s class definitions will remain instances of the old class; they are not magically transformed into instances of the reloaded class. This can lead to subtle inconsistencies and is the primary reason reloading is generally avoided in production code.
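The stale-instance pitfall described above can be reproduced with a throwaway module. This sketch writes a hypothetical module named reload_demo to a temporary directory; the module name, the Greeter class, and its text attribute are all invented for the illustration.

```python
import importlib
import pathlib
import shutil
import sys
import tempfile

# Write a throwaway module to a temporary directory (hypothetical names).
tmp_dir = tempfile.mkdtemp()
module_path = pathlib.Path(tmp_dir) / "reload_demo.py"
module_path.write_text("class Greeter:\n    text = 'v1'\n")
sys.path.insert(0, tmp_dir)
importlib.invalidate_caches()

reload_demo = importlib.import_module("reload_demo")
old_class = reload_demo.Greeter
old_instance = reload_demo.Greeter()

# Simulate an edit. The new source has a different length, so any cached
# bytecode is treated as stale and the file is recompiled on reload.
module_path.write_text("class Greeter:\n    text = 'version 2'\n")
reloaded = importlib.reload(reload_demo)

print(reloaded is reload_demo)                        # True: same module object
print(reload_demo.Greeter is old_class)               # False: a new class object
print(isinstance(old_instance, reload_demo.Greeter))  # False: still the old class
print(old_instance.text, reload_demo.Greeter.text)    # v1 version 2

# Undo the side effects of the demonstration.
sys.path.remove(tmp_dir)
sys.modules.pop("reload_demo", None)
shutil.rmtree(tmp_dir, ignore_errors=True)
```

The module object keeps its identity, which is why names bound with "import reload_demo" see the new code, while objects created before the reload keep pointing at the old class.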
Inspecting the Import System State
The importlib API provides functions to introspect the state of the import system. importlib.util.find_spec() is a powerful tool that returns a ModuleSpec object describing how a module would be found and loaded. This object contains metadata such as the module’s name, the loader responsible for it, its origin (often the file path), and whether it is a package.
import importlib.util
spec = importlib.util.find_spec('json')
print(spec.origin)  # e.g. /usr/lib/python3.9/json/__init__.py
print(spec.loader)  # e.g. <_frozen_importlib_external.SourceFileLoader object at 0x...>
print(spec.submodule_search_locations)  # e.g. ['/usr/lib/python3.9/json'], because 'json' is a package
# A module is a package if its spec has submodule search locations;
# for a plain module such as 'csv' the attribute is None.
is_package = spec.submodule_search_locations is not None
This introspection is invaluable for building developer tools, debugging complex import issues, or implementing custom import logic that needs to understand the existing import landscape.
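One common use of find_spec() is checking whether an optional dependency is installed without importing it. The helper below is a sketch; the name is_available is hypothetical and the placeholder module name is chosen only because it should not exist.

```python
import importlib.util


def is_available(module_name):
    """Return True if module_name could be imported, without importing it.

    Hypothetical helper. For a dotted name, find_spec() imports the
    parent packages in order to search them.
    """
    try:
        return importlib.util.find_spec(module_name) is not None
    except ModuleNotFoundError:
        # Raised when a parent package of a dotted name does not exist.
        return False


print(is_available("json"))                             # True
print(is_available("json.decoder"))                     # True
print(is_available("no_such_module_for_this_example"))  # False
```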
Creating Custom Importers with Meta Path Finders
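The most advanced use of importlib is creating custom importers. This involves implementing the importlib.abc.MetaPathFinder and importlib.abc.Loader abstract base classes. A meta path finder is registered by adding an instance of it to sys.meta_path; its position in the list determines its priority, so inserting at index 0 lets it answer before the built-in finders, while appending makes it a fallback. When an import statement is executed for a module that is not already in sys.modules, Python iterates through this list in order, asking each finder whether it can handle the requested module.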
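The search over sys.meta_path can be imitated in a few lines. This is a simplified illustration for top-level module names only, not the interpreter's actual implementation; the function name first_matching_finder is hypothetical.

```python
import sys


def first_matching_finder(fullname):
    """Ask each meta path finder in order; return (finder, spec) for the first hit.

    Simplified: the real machinery consults sys.modules first and passes
    the parent package's __path__ when resolving submodules.
    """
    for finder in sys.meta_path:
        find_spec = getattr(finder, "find_spec", None)
        if find_spec is None:
            continue  # skip objects that do not implement find_spec()
        spec = find_spec(fullname, None)  # path is None for top-level names
        if spec is not None:
            return finder, spec
    return None, None


finder, spec = first_matching_finder("json")
print(finder)
print(spec.name, spec.origin)
```

On a standard installation the printed finder is typically the path-based finder, since json lives on sys.path; the exact output depends on the interpreter and on any finders other packages have installed.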
A custom finder must implement a find_spec() method that returns a ModuleSpec if it can handle the module, or None if it cannot. The ModuleSpec must specify a loader. The loader must then implement an exec_module() method, which is responsible for executing the module and populating the module’s __dict__.
The following example demonstrates a simplistic importer that loads a module from a dictionary, serving as an in-memory module store.
import sys
import types
from importlib.abc import MetaPathFinder, Loader
from importlib.util import spec_from_loader

# A simple in-memory module cache
module_cache = {
    'mymemmodule': """
def hello():
    return "Hello from a module in memory!"
"""
}

class MemoryLoader(Loader):
    def __init__(self, code):
        self.code = code

    def exec_module(self, module):
        # Execute the module's code in its namespace
        exec(self.code, module.__dict__)

class MemoryFinder(MetaPathFinder):
    def find_spec(self, fullname, path, target=None):
        if fullname in module_cache:
            module_code = module_cache[fullname]
            loader = MemoryLoader(module_code)
            # Create a module spec. The origin is set to a fictional location.
            return spec_from_loader(fullname, loader, origin='<memory>')
        return None  # Let other finders handle it

# Register the custom finder
sys.meta_path.insert(0, MemoryFinder())

# Now we can import our module from memory
import mymemmodule
print(mymemmodule.hello())  # Output: Hello from a module in memory!
This architecture is powerful. The same finder and loader protocols underpin zipimport, which imports modules from ZIP archives (it plugs in through sys.path_hooks rather than sys.meta_path), and resource-access tools such as pkg_resources rely on loaders to read data files from within packages. Best practices for custom importers include keeping find_spec() fast, because it runs for every import that is not already cached; correctly handling module reloading; and cleaning up any resources the importer holds, including its entry in sys.meta_path. A finder should also fail gracefully when it cannot handle a request, returning None so that the other finders on the meta path can proceed.
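One way to keep cleanup reliable is to install a finder only for the duration of a with block. The sketch below is self-contained and makes several assumptions for illustration: the class names StringLoader and DictFinder, the context manager temporarily_installed, and the module name demo_memory_module are all hypothetical.

```python
import contextlib
import importlib
import sys
from importlib.abc import Loader, MetaPathFinder
from importlib.util import spec_from_loader


class StringLoader(Loader):
    """Executes source code held in a string (hypothetical example class)."""

    def __init__(self, source):
        self.source = source

    def exec_module(self, module):
        exec(self.source, module.__dict__)


class DictFinder(MetaPathFinder):
    """Finds modules whose source is stored in a dict (hypothetical)."""

    def __init__(self, sources):
        self.sources = sources

    def find_spec(self, fullname, path, target=None):
        source = self.sources.get(fullname)
        if source is None:
            return None  # not ours; let the next finder try
        return spec_from_loader(fullname, StringLoader(source), origin="<memory>")


@contextlib.contextmanager
def temporarily_installed(finder):
    """Register finder on entry; unregister it and drop its modules on exit."""
    sys.meta_path.insert(0, finder)
    try:
        yield finder
    finally:
        sys.meta_path.remove(finder)
        for name in finder.sources:
            spec = getattr(sys.modules.get(name), "__spec__", None)
            # Only evict modules that this kind of loader actually created.
            if spec is not None and isinstance(spec.loader, StringLoader):
                del sys.modules[name]


with temporarily_installed(DictFinder({"demo_memory_module": "ANSWER = 42\n"})):
    demo = importlib.import_module("demo_memory_module")
    print(demo.ANSWER)  # Output: 42

print("demo_memory_module" in sys.modules)  # Output: False
```

Removing the cached entries from sys.modules matters as much as removing the finder: a module left in the cache would keep satisfying later imports even though nothing on the meta path could load it again.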