To support reproducible builds, the setting of of SOURCE_DATE_EPOCH triggers the py_compile module -- and by extension, compileall -- to forcibly compile with hash-based .pyc files. This eliminates the possibility of timestamp-based .pyc files which vary between builds.
Python now supports checking bytecode cache up-to-dateness with a hash of the
source contents rather than volatile source metadata. See the PEP for details.
While a fairly straightforward idea, quite a lot of code had to be modified due
to the pervasiveness of pyc implementation details in the codebase. Changes in
this commit include:
- The core changes to importlib to understand how to read, validate, and
regenerate hash-based pycs.
- Support for generating hash-based pycs in py_compile and compileall.
- Modifications to our siphash implementation to support passing a custom
key. We then expose it to importlib through _imp.
- Updates to all places in the interpreter, standard library, and tests that
manually generate or parse pyc files to grok the new format.
- Support in the interpreter command line code for long options like
--check-hash-based-pycs.
- Tests and documentation for all of the above.
The concept of .pyo files no longer exists. Now .pyc files have an
optional `opt-` tag which specifies if any extra optimizations beyond
the peepholer were applied.