Python now supports checking bytecode cache up-to-dateness with a hash of the
source contents rather than volatile source metadata. See the PEP for details.
While a fairly straightforward idea, quite a lot of code had to be modified due
to the pervasiveness of pyc implementation details in the codebase. Changes in
this commit include:
- The core changes to importlib to understand how to read, validate, and
regenerate hash-based pycs.
- Support for generating hash-based pycs in py_compile and compileall.
- Modifications to our siphash implementation to support passing a custom
key. We then expose it to importlib through _imp.
- Updates to all places in the interpreter, standard library, and tests that
manually generate or parse pyc files to grok the new format.
- Support in the interpreter command line code for long options like
--check-hash-based-pycs.
- Tests and documentation for all of the above.
Previously AttributeError was raised, but that's not very reflective of the fact that the requested module can't be found since the specified parent isn't actually a package.
modules can't be lazily loaded.
Thanks to Python 3.6 allowing for types.ModuleType to have its
__class__ mutated, the restriction can be lifted by calling
create_module() on the wrapped loader.
Known limitations of the current implementation:
- documentation changes are incomplete
- there's a reference leak I haven't tracked down yet
The leak is most visible by running:
./python -m test -R3:3 test_importlib
However, you can also see it by running:
./python -X showrefcount
Importing the array or _testmultiphase modules, and
then deleting them from both sys.modules and the local
namespace shows significant increases in the total
number of active references each cycle. By contrast,
with _testcapi (which continues to use single-phase
initialisation) the global refcounts stabilise after
a couple of cycles.
The concept of .pyo files no longer exists. Now .pyc files have an
optional `opt-` tag which specifies if any extra optimizations beyond
the peepholer were applied.
importlib.abc.Loader.exec_module() is also defined.
Before this change, create_module() was optional **and** could return
None to trigger default semantics. This change now reduces the
options for choosing default semantics to one and in the most
backporting-friendly way (define create_module() to return None).
Along the way, dismantle importlib._bootstrap._SpecMethods as it was
no longer relevant and constructing the new function required
partially dismantling the class anyway.
old methods now provide implementations when PEP 451 APIs are present.
This should help with backwards-compatibility with code which has not
been updated to work with PEP 451.
importlib.machinery.FileFinder.
While originally moved to stop special-casing '' as PathFinder farther
up the typical call chain now uses the cwd in the instance of '', it
was deemed an unnecessary risk to breaking subclasses of FileFinder to
take the special-casing out.
and stop importlib.machinery.FileFinder treating '' as '.'.
Previous PathFinder transformed '' into '.' which led to __file__ for
modules imported from the cwd to always be relative paths. This meant
the values of the attribute were wrong as soon as the cwd changed.
This change now means that as long as the site module is run (which
makes all entries in sys.path absolute) then all values for __file__
will also be absolute unless it's for __main__ when specified by file
path in a relative way (modules imported by runpy will have an
absolute path).
Now that PathFinder is no longer treating '' as '.' it only makes
sense for FileFinder to stop doing so as well. Now no transformation
is performed for the directory given to the __init__ method.
Thanks to Madison May for the initial patch.
The helper function makes it easier to implement
imoprtlib.abc.InspectLoader.get_source() by making that function
require just the raw bytes for source code and handling all other
details.
importlib.abc.Loader.init_module_attrs() and implement
importlib.abc.InspectLoader.load_module().
The importlib.abc.Loader.init_module_attrs() method sets the various
attributes on the module being loaded. It is done unconditionally to
support reloading. Typically people used
importlib.util.module_for_loader, but since that's a decorator there
was no way to override it's actions, so init_module_attrs() came into
existence to allow for overriding. This is also why module_for_loader
is now pending deprecation (having its other use replaced by
importlib.util.module_to_load).
All of this allowed for importlib.abc.InspectLoader.load_module() to
be implemented. At this point you can now implement a loader with
nothing more than get_code() (which only requires get_source();
package support requires is_package()). Thanks to init_module_attrs()
the implementation of load_module() is basically a context manager
containing 2 methods calls, a call to exec(), and a return statement.