The os module and the PyUnicode_FSDecoder() function no longer accept
bytes-like paths, like bytearray and memoryview types: only the exact
bytes type is accepted for bytes strings.
Integer to and from text conversions via CPython's bignum `int` type is not safe against denial of service attacks due to malicious input. Very large input strings with hundred thousands of digits can consume several CPU seconds.
This PR comes fresh from a pile of work done in our private PSRT security response team repo.
Signed-off-by: Christian Heimes [Red Hat] <christian@python.org>
Tons-of-polishing-up-by: Gregory P. Smith [Google] <greg@krypto.org>
Reviews via the private PSRT repo via many others (see the NEWS entry in the PR).
<!-- gh-issue-number: gh-95778 -->
* Issue: gh-95778
<!-- /gh-issue-number -->
I wrote up [a one pager for the release managers](https://docs.google.com/document/d/1KjuF_aXlzPUxTK4BMgezGJ2Pn7uevfX7g0_mvgHlL7Y/edit#). Much of that text wound up in the Issue. Backports PRs already exist. See the issue for links.
* Stores all location info in linetable to conform to PEP 626.
* Remove column table from code objects.
* Remove end-line table from code objects.
* Document new location table format
* Moves the bytecode to the end of the corresponding PyCodeObject, and quickens it in-place.
* Removes the almost-always-unused co_varnames, co_freevars, and co_cellvars member caches
* _PyOpcode_Deopt is a new mapping from all opcodes to their un-quickened forms.
* _PyOpcode_InlineCacheEntries is renamed to _PyOpcode_Caches
* _Py_IncrementCountAndMaybeQuicken is renamed to _PyCode_Warmup
* _Py_Quicken is renamed to _PyCode_Quicken
* _co_quickened is renamed to _co_code_adaptive (and is now a read-only memoryview).
* Do not emit unused nonzero opargs anymore in the compiler.
* Add PRECALL_FUNCTION opcode.
* Move 'call shape' varaibles into struct.
* Replace CALL_NO_KW and CALL_KW with KW_NAMES and CALL instructions.
* Specialize for builtin methods taking using the METH_FASTCALL | METH_KEYWORDS protocol.
* Allow kwnames for specialized calls to builtin types.
* Specialize calls to tuple(arg) and str(arg).
* Add RETURN_GENERATOR and JUMP_NO_INTERRUPT opcodes.
* Trim frame and generator by word each.
* Minor refactor of frame.c
* Update test.test_sys to account for smaller frames.
* Treat generator functions as normal functions when evaluating and specializing.
This PR is part of PEP 657 and augments the compiler to emit ending
line numbers as well as starting and ending columns from the AST
into compiled code objects. This allows bytecodes to be correlated
to the exact source code ranges that generated them.
This information is made available through the following public APIs:
* The `co_positions` method on code objects.
* The C API function `PyCode_Addr2Location`.
Co-authored-by: Batuhan Taskaya <isidentical@gmail.com>
Co-authored-by: Ammar Askar <ammar@ammaraskar.com>
Emit a deprecation warning if the numeric literal is immediately followed by
one of keywords: and, else, for, if, in, is, or. Raise a syntax error with
more informative message if it is immediately followed by other keyword or
identifier.
Automerge-Triggered-By: GH:pablogsal
* Modify compiler to reduce stack consumption for large expressions.
* Add more tests for stack usage.
* Add NEWS item.
* Raise SystemError for truly excessive stack use.
* Delete jump instructions that bypass empty blocks
* Add news entry
* Explicitly check for unconditional jump opcodes
Using the is_jump function results in the inclusion of instructions like
returns for which this optimization is not really valid. So, instead
explicitly check that the instruction is an unconditional jump.
* Handle conditional jumps, delete jumps gracefully
* Ensure b_nofallthrough and b_reachable are valid
* Add test for redundant jumps
* Regenerate importlib.h and edit Misc/ACKS
* Fix bad whitespace
After parsing is done in single statement mode, the tokenizer buffer has to be checked for additional lines and a `SyntaxError` must be raised, in case there are any.
Co-authored-by: Pablo Galindo <Pablogsal@gmail.com>
* Rename PyConfig.use_peg to _use_peg_parser
* Document PyConfig._use_peg_parser and mark it a deprecated
* Mark -X oldparser option and PYTHONOLDPARSER env var as deprecated
in the documentation.
* Add use_old_parser() and skip_if_new_parser() to test.support
* Remove sys.flags.use_peg: use_old_parser() uses
_testinternalcapi.get_configs() instead.
* Enhance test_embed tests
* subprocess._args_from_interpreter_flags() copies -X oldparser
There are some same consts in a module. This commit merges them into
single instance. It reduces number of objects in memory after loading modules.
https://bugs.python.org/issue34100
Fix an off by one error in the peephole optimizer when checking for unreachable code beyond a return.
Do a bounds check within find_op so it can return before going past the end as a safety measure.
7db3c48833 (diff-a33329ae6ae0bb295d742f0caf93c137)
introduced this off by one error while fixing another one nearby.
This bug was shipped in all Python 3.6 and 3.7 releases.
The included unittest won't fail unless you do a clang msan build.
Two kind of mistakes:
1. Missed space. After concatenating there is no space between words.
2. Missed comma. Causes unintentional concatenating in a list of strings.
Issue #25843: When compiling code, don't merge constants if they are equal but
have a different types. For example, "f1, f2 = lambda: 1, lambda: 1.0" is now
correctly compiled to two different functions: f1() returns 1 (int) and f2()
returns 1.0 (int), even if 1 and 1.0 are equal.
Add a new _PyCode_ConstantKey() private function.
This avoids possible buffer overreads when int(), float(), compile(), exec()
and eval() are passed bytes-like objects. Similar code is removed from the
complex() constructor, where it was not reachable.
Patch by John Leitch, Serhiy Storchaka and Martin Panter.
The concept of .pyo files no longer exists. Now .pyc files have an
optional `opt-` tag which specifies if any extra optimizations beyond
the peepholer were applied.
Previously, excessive nesting in expressions would blow the
stack and segfault the interpreter. Now, a hard limit based
on the configured recursion limit and a hardcoded scaling
factor is applied.