This effectively reverts the Makefile change in gh-31637. I've added some notes so it is more clear what is going on.
We also update the "Check if generated files are up to date" job to run "make regen-deepfreeze" to ensure "make regen-global-objects" catches deepfreeze.c.
https://bugs.python.org/issue47146
We have to run "make regen-deepfreeze" before running Tools/scripts/generate-global-objects.py; otherwise we will miss any changes to global objects in deep-frozen modules (which aren't committed in the repo). However, building $(PYTHON_FOR_FREEZE) fails if one of its source files had a global object (e.g. via _Py_ID(...)) added or removed, without generate-global-objects.py running first. So "make regen-global-objects" would sometimes fail.
We solve this by running generate-global-objects.py before *and* after "make regen-deepfreeze". To speed things up and cut down on noise, we also avoid updating the global objects files if there are no changes to them.
https://bugs.python.org/issue46712
* Moves the bytecode to the end of the corresponding PyCodeObject, and quickens it in-place.
* Removes the almost-always-unused co_varnames, co_freevars, and co_cellvars member caches
* _PyOpcode_Deopt is a new mapping from all opcodes to their un-quickened forms.
* _PyOpcode_InlineCacheEntries is renamed to _PyOpcode_Caches
* _Py_IncrementCountAndMaybeQuicken is renamed to _PyCode_Warmup
* _Py_Quicken is renamed to _PyCode_Quicken
* _co_quickened is renamed to _co_code_adaptive (and is now a read-only memoryview).
* Do not emit unused nonzero opargs anymore in the compiler.
Instead of manually enumerating the global strings in generate_global_objects.py, we extrapolate the list from usage of _Py_ID() and _Py_STR() in the source files.
This is partly inspired by gh-31261.
https://bugs.python.org/issue46541
We're no longer using _Py_IDENTIFIER() (or _Py_static_string()) in any core CPython code. It is still used in a number of non-builtin stdlib modules.
The replacement is: PyUnicodeObject (not pointer) fields under _PyRuntimeState, statically initialized as part of _PyRuntime. A new _Py_GET_GLOBAL_IDENTIFIER() macro facilitates lookup of the fields (along with _Py_GET_GLOBAL_STRING() for non-identifier strings).
https://bugs.python.org/issue46541#msg411799 explains the rationale for this change.
The core of the change is in:
* (new) Include/internal/pycore_global_strings.h - the declarations for the global strings, along with the macros
* Include/internal/pycore_runtime_init.h - added the static initializers for the global strings
* Include/internal/pycore_global_objects.h - where the struct in pycore_global_strings.h is hooked into _PyRuntimeState
* Tools/scripts/generate_global_objects.py - added generation of the global string declarations and static initializers
I've also added a --check flag to generate_global_objects.py (along with make check-global-objects) to check for unused global strings. That check is added to the PR CI config.
The remainder of this change updates the core code to use _Py_GET_GLOBAL_IDENTIFIER() instead of _Py_IDENTIFIER() and the related _Py*Id functions (likewise for _Py_GET_GLOBAL_STRING() instead of _Py_static_string()). This includes adding a few functions where there wasn't already an alternative to _Py*Id(), replacing the _Py_Identifier * parameter with PyObject *.
The following are not changed (yet):
* stop using _Py_IDENTIFIER() in the stdlib modules
* (maybe) get rid of _Py_IDENTIFIER(), etc. entirely -- this may not be doable as at least one package on PyPI using this (private) API
* (maybe) intern the strings during runtime init
https://bugs.python.org/issue46541
This reduces the size of the data segment by **300 KB** of the executable because if the modules are deep-frozen then the marshalled frozen data just wastes space. This was inspired by comment by @gvanrossum in https://github.com/python/cpython/pull/29118#issuecomment-958521863. Note: There is a new option `--deepfreeze-only` in `freeze_modules.py` to change this behavior, it is on be default to save disk space.
```console
# du -s ./python before
27892 ./python
# du -s ./python after
27524 ./python
```
Automerge-Triggered-By: GH:ericsnowcurrently
This change is a prerequisite for generating code for other global objects (like strings in gh-30928).
(We borrowed some code from Tools/scripts/deepfreeze.py.)
https://bugs.python.org/issue46541
The build system now uses a :program:`_bootstrap_python` interpreter for
freezing and deepfreezing again. To speed up build process the build tools
:program:`_bootstrap_python` and :program:`_freeze_module` are no longer
build with LTO.
Cross building depends on a build Python interpreter, which must have same
version and bytecode as target host Python.
Instead we use $(PYTHON_FOR_REGEN) .../deepfreeze.py with the
frozen .h file as input, as we did for Windows in bpo-45850.
We also get rid of the code that generates the .h files
when make regen-frozen is run (i.e., .../make_frozen.py),
and the MANIFEST file.
Restore Python 3.8 and 3.9 as Windows host Python again
Co-authored-by: Kumar Aditya <59607654+kumaraditya303@users.noreply.github.com>
Implement changes to build with deep-frozen modules on Windows.
Note that we now require Python 3.10 as the "bootstrap" or "host" Python.
This causes a modest startup speed (around 7%) on Windows.
This gains 10% or more in startup time for `python -c pass` on UNIX-ish systems.
The Makefile.pre.in generating code builds on Eric's work for bpo-45020, but the .c file generator is new.
Windows version TBD.
Currently custom modules (the array set on PyImport_FrozenModules) replace all the frozen stdlib modules. That can be problematic and is unlikely to be what the user wants. This change treats the custom frozen modules as additions instead. They take precedence over all other frozen modules except for those needed to bootstrap the import system. If the "code" field of an entry in the custom array is NULL then that frozen module is treated as disabled, which allows a custom entry to disable a frozen stdlib module.
This change allows us to get rid of is_essential_frozen_module() and simplifies the logic for which frozen modules should be ignored.
https://bugs.python.org/issue45395
This is a cross-platform check that the symbols are actually
exported in the ABI, not e.g. hidden in a macro.
Caveat: PyModule_Create2 & PyModule_FromDefAndSpec2 are skipped.
These aren't exported on some of our buildbots. This is a bug
(bpo-44133). This test now makes sure all the others don't regress.
Like #28744 but for the Tools directory.
[skip issue] Opening a related issue is pending python/psf-infra-meta#130
Automerge-Triggered-By: GH:pablogsal
In the list of generated frozen modules at the top of Tools/scripts/freeze_modules.py, you will find that some of the modules have a different name than the module (or .py file) that is actually frozen. Let's call each case an "alias". Aliases do not come into play until we get to the (generated) list of modules in Python/frozen.c. (The tool for freezing modules, Programs/_freeze_module, is only concerned with the source file, not the module it will be used for.)
Knowledge of which frozen modules are aliases (and the identity of the original module) normally isn't important. However, this information is valuable when we go to set __file__ on frozen stdlib modules. This change updates Tools/scripts/freeze_modules.py to map aliases to the original module name (or None if not a stdlib module) in Python/frozen.c. We also add a helper function in Python/import.c to look up a frozen module's alias and add the result of that function to the frozen info returned from find_frozen().
https://bugs.python.org/issue45020
I've added a number of test-only modules. Some of those cases are covered by the recently frozen stdlib modules (and some will be once we add encodings back in). However, I figured we'd play it safe by having a set of modules guaranteed to be there during tests.
https://bugs.python.org/issue45020
Currently we're freezing the __init__.py twice, duplicating the built data unnecessarily With this change we do it once. There is no change in runtime behavior.
https://bugs.python.org/issue45020
* Makefile.pre.in: Add $(srcdir) when needed, remove it when it was
used by mistake.
* freeze_modules.py tool uses ./Programs/_freeze_module if the
executable doesn't exist in the source tree.
The update_file.py tool now preserves the end of line of the updated
file. Fix the "make regen-frozen" command: it no longer changes the
end of line of PCbuild/ files on Unix. Git changes the end of line
depending on the platform.
The main advantage is that the files will no longer show up in diffs and PRs. That means, for a PR, the number of files / lines changed will more clearly reflect the actual change. (This is essentially an un-revert of gh-28375.)
https://bugs.python.org/issue45020
The main advantage is that the files will no longer show up in diffs and PRs. That means, for a PR, the number of files / lines changed will more clearly reflect the actual change.
https://bugs.python.org/issue45020
Here's one more small cleanup that should have been in PR gh-28319. We eliminate stdout side-effects from importing the frozen __hello__ module, and update tests accordingly. We also move the module's source file into Lib/ from Toos/freeze/flag.py.
https://bugs.python.org/issue45019
This will enable us to drop the frozen module header files from the repository.
It does currently cause many source files to be built twice, which just takes more time. For whoever comes to fix this in the future, the files shared between freeze_module and pythoncore should be put into a static library that is consumed by both.
Doing this provides significant performance gains for runtime startup (~15% with all the imported modules frozen). We don't yet freeze all the imported modules because there are a few hiccups in the build systems we need to sort out first. (See bpo-45186 and bpo-45188.)
Note that in PR GH-28320 we added a command-line flag (-X frozen_modules=[on|off]) that allows users to opt out of (or into) using frozen modules. The default is still "off" but we will change it to "on" as soon as we can do it in a way that does not cause contributors pain.
https://bugs.python.org/issue45020
There are a few things I missed in gh-27980. This is a follow-up that will make subsequent PRs cleaner. It includes fixes to tests and tools that reference the frozen modules.
https://bugs.python.org/issue45019
Frozen modules must be added to several files in order to work properly. Before this change this had to be done manually. Here we add a tool to generate the relevant lines in those files instead. This helps us avoid mistakes and omissions.
https://bugs.python.org/issue45019
* Remove struct _node from the stable ABI list
This struct was removed along with the old parser in Python 3.9 (PEP 617)
* Stable ABI list: Use the public name "PyFrameObject" rather than "_frame"
* Ensure limited API doesn't contain private names
Names prefixed by an underscore are private by definition.
* Add a blurb
* Specialize LOAD_ATTR with LOAD_ATTR_SLOT and LOAD_ATTR_SPLIT_KEYS
* Move dict-common.h to internal/pycore_dict.h
* Add LOAD_ATTR_WITH_HINT specialized opcode.
* Quicken in function if loopy
* Specialize LOAD_ATTR for module attributes.
* Add specialization stats
The patch from [bpo-44074]() does not account for a possibly non-English locale and blindly greps for "HEAD branch" in a possibly localized text.
Automerge-Triggered-By: GH:pitrou
- Reformat the C API and ABI Versioning page (and extend/clarify a bit)
- Rewrite the stable ABI docs into a general text on C API Compatibility
- Add a list of Limited API contents, and notes for the individual items.
- Replace `Include/README.rst` with a link to a devguide page with the same info
"Zero cost" exception handling.
* Uses a lookup table to determine how to handle exceptions.
* Removes SETUP_FINALLY and POP_TOP block instructions, eliminating (most of) the runtime overhead of try statements.
* Reduces the size of the frame object by about 60%.
The stable_abi.py script no longer parse macros. Macro targets can be
static inline functions which are not part of the stable ABI, only
part of the limited C API.
Run "make regen-limited-abi" to exclude PyType_HasFeature from
Doc/data/stable_abi.dat.
Rename Include/symtable.h to to Include/internal/pycore_symtable.h,
don't export symbols anymore (replace PyAPI_FUNC and PyAPI_DATA with
extern) and rename functions:
* PyST_GetScope() to _PyST_GetScope()
* PySymtable_BuildObject() to _PySymtable_Build()
* PySymtable_Free() to _PySymtable_Free()
Remove PySymtable_Build(), Py_SymtableString() and
Py_SymtableStringObject() functions.
The Py_SymtableString() function was part the stable ABI by mistake
but it could not be used, since the symtable.h header file was
excluded from the limited C API.
The Python symtable module remains available and is unchanged.
Add frozen modules to sys.stdlib_module_names. For example, add
"_frozen_importlib" and "_frozen_importlib_external" names.
Add "list_frozen" command to Programs/_testembed.
Include/{odictobject.h,parser_interface.h,picklebufobject.h,pydebug.h,pyfpe.h}
into Include/cpython/.
Parser: peg_api: include Python.h instead of parser_interface.h.
Add a new configure --without-static-libpython option to not build
the libpythonMAJOR.MINOR.a static library and not install the
python.o object file.
Fix smelly.py and stable_abi.py tools when libpython3.10.a is
missing.
Add a private list of all stdlib modules: _Py_module_names.
* Add Tools/scripts/generate_module_names.py script.
* Makefile: Add "make regen-module-names" command.
* setup.py: Add --list-module-names option.
* GitHub Action and Travis CI also runs "make regen-module-names",
not ony "make regen-all", to ensure that the module names remains
up to date.
The smelly.py script now also checks the Python dynamic library and
extension modules, not only the Python static library. Make also the
script more verbose: explain what it does.
The GitHub Action job now builds Python with the libpython dynamic
library.
This is one of the few files that has intimate knowledge of the pyc file
format. Since it lacks tests it tends to become outdated fairly quickly.
At present it has been broken since the introduction of PEP 552.
When there is a SyntaxError after reading the last input character from
the tokenizer and if no newline follows it, the error message used to be
`unexpected EOF while parsing`, which is wrong.
Break up COMPARE_OP into four logically distinct opcodes:
* COMPARE_OP for rich comparisons
* IS_OP for 'is' and 'is not' tests
* CONTAINS_OP for 'in' and 'is not' tests
* JUMP_IF_NOT_EXC_MATCH for checking exceptions in 'try-except' statements.
This is the converse of GH-15353 -- in addition to plenty of
scripts in the tree that are marked with the executable bit
(and so can be directly executed), there are a few that have
a leading `#!` which could let them be executed, but it doesn't
do anything because they don't have the executable bit set.
Here's a command which finds such files and marks them. The
first line finds files in the tree with a `#!` line *anywhere*;
the next-to-last step checks that the *first* line is actually of
that form. In between we filter out files that already have the
bit set, and some files that are meant as fragments to be
consumed by one or another kind of preprocessor.
$ git grep -l '^#!' \
| grep -vxFf <( \
git ls-files --stage \
| perl -lane 'print $F[3] if (!/^100644/)' \
) \
| grep -ve '\.in$' -e '^Doc/includes/' \
| while read f; do
head -c2 "$f" | grep -qxF '#!' \
&& chmod a+x "$f"; \
done
"Include/token.h", "Lib/token.py" (containing now some data moved from
"Lib/tokenize.py") and new files "Parser/token.c" (containing the code
moved from "Parser/tokenizer.c") and "Doc/library/token-list.inc" (included
in "Doc/library/token.rst") are now generated from "Grammar/Tokens" by
"Tools/scripts/generate_token.py". The script overwrites files only if
needed and can be used on the read-only sources tree.
"Lib/symbol.py" is now generated by "Tools/scripts/generate_symbol_py.py"
instead of been executable itself.
Added new make targets "regen-token" and "regen-symbol" which are now
dependencies of "regen-all".
The documentation contains now strings for operators and punctuation tokens.
Two kind of mistakes:
1. Missed space. After concatenating there is no space between words.
2. Missed comma. Causes unintentional concatenating in a list of strings.