Integer to and from text conversions via CPython's bignum `int` type is not safe against denial of service attacks due to malicious input. Very large input strings with hundred thousands of digits can consume several CPU seconds.
This PR comes fresh from a pile of work done in our private PSRT security response team repo.
Signed-off-by: Christian Heimes [Red Hat] <christian@python.org>
Tons-of-polishing-up-by: Gregory P. Smith [Google] <greg@krypto.org>
Reviews via the private PSRT repo via many others (see the NEWS entry in the PR).
<!-- gh-issue-number: gh-95778 -->
* Issue: gh-95778
<!-- /gh-issue-number -->
I wrote up [a one pager for the release managers](https://docs.google.com/document/d/1KjuF_aXlzPUxTK4BMgezGJ2Pn7uevfX7g0_mvgHlL7Y/edit#). Much of that text wound up in the Issue. Backports PRs already exist. See the issue for links.
If an HTTP link is redirected to a same looking HTTPS link, the latter can
be used directly without changes in readability and behavior.
It protects from a men-in-the-middle attack.
This change does not affect Python examples.
Support for bytes broke sometime between Python 3.2 and 3.6 and has been broken ever since. Trying to bring back supports is surprisingly difficult in the face of -b and checking for keys in sys.path_importer_cache. Since the support was broken for so long, trying to overcome the difficulty of bringing back the support has been deemed not worth it.
Co-authored-by: Eryk Sun <eryksun@gmail.com>
Co-authored-by: Brett Cannon <brett@python.org>
Add the -P command line option and the PYTHONSAFEPATH environment
variable to not prepend a potentially unsafe path to sys.path.
* Add sys.flags.safe_path flag.
* Add PyConfig.safe_path member.
* Programs/_bootstrap_python.c uses config.safe_path=0.
* Update subprocess._optim_args_from_interpreter_flags() to handle
the -P command line option.
* Modules/getpath.py sets safe_path to 1 if a "._pth" file is
present.
This is true of all dictionaries in Python, but this one tends to
catch people off guard as they don't realize when sys.modules might
change out from underneath them as a hidden side effect of their
code. Copying it first avoids the RuntimeError. An example when
this happens in single threaded code are codecs being loaded which
are an implicit time of use import that most need not think about.
The sys module uses the kernel32.dll version number, which can vary from the "actual" Windows version.
Since the best option for getting the version is WMI (which is expensive), we switch back to launching cmd.exe (which is also expensive, but a lot less code on our part).
sys.getwindowsversion() is not updated to avoid launching executables from that module.
Add Doc/using/configure.rst documentation to document configure,
preprocessor, compiler and linker options.
Add a new section about the "Python debug build".
Enhance the documentation of the Python startup, filesystem encoding
and error handling, locale encoding. Add a new "Python UTF-8 Mode"
section.
* Add "locale encoding" and "filesystem encoding and error handler"
to the glossary
* Remove documentation from Include/cpython/initconfig.h: move it to
Doc/c-api/init_config.rst.
* Doc/c-api/init_config.rst:
* Document command line options and environment variables
* Document default values.
* Add a new "Python UTF-8 Mode" section in Doc/library/os.rst.
* Add warnings to Py_DecodeLocale() and Py_EncodeLocale() docs.
* Document how Python selects the filesystem encoding and error
handler at a single place: PyConfig.filesystem_encoding and
PyConfig.filesystem_errors.
* PyConfig: move orig_argv member at the right place.
This adds a new function named sys._current_exceptions() which is equivalent ot
sys._current_frames() except that it returns the exceptions currently handled
by other threads. It is equivalent to calling sys.exc_info() for each running
thread.
If the nl_langinfo(CODESET) function returns an empty string, Python
now uses UTF-8 as the filesystem encoding.
In May 2010 (commit b744ba1d14), I
modified Python to log a warning and use UTF-8 as the filesystem
encoding (instead of None) if nl_langinfo(CODESET) returns an empty
string.
In August 2020 (commit 94908bbc15), I
modified Python startup to fail with a fatal error and a specific
error message if nl_langinfo(CODESET) returns an empty string. The
intent was to prevent guessing the encoding and also investigate user
configuration where this case happens.
In 10 years (2010 to 2020), I saw zero user report about the error
message related to nl_langinfo(CODESET) returning an empty string.
Today, UTF-8 became the defacto standard and it's safe to make the
assumption that the user expects UTF-8. For example,
nl_langinfo(CODESET) can return an empty string on macOS if the
LC_CTYPE locale is not supported, and UTF-8 is the default encoding
on macOS.
While this change is likely to not affect anyone in practice, it
should make UTF-8 lover happy ;-)
Rewrite also the documentation explaining how Python selects the
filesystem encoding and error handler.
This commit reverts commit ac0333e1e1 as the original links are working again and they provide extended features such as comments and alternative versions.
Add sys.orig_argv attribute: the list of the original command line
arguments passed to the Python executable.
Rename also PyConfig._orig_argv to PyConfig.orig_argv and
document it.
Try to make the meaning of platlibdir clear. The previous wording could
be misinterpreted to suggest that it will be used to find all shared
libraries on the system, and not just Python extensions. Furthermore,
it was unclear whether it affects third-party (site-packages) extensions
or not. The new wording tries to make its dual purpose clear,
and provide the additional example of extensions in site-packages.
Add --with-platlibdir option to the configure script: name of the
platform-specific library directory, stored in the new sys.platlitdir
attribute. It is used to build the path of platform-specific dynamic
libraries and the path of the standard library.
It is equal to "lib" on most platforms. On Fedora and SuSE, it is
equal to "lib64" on 64-bit systems.
Co-Authored-By: Jan Matějek <jmatejek@suse.com>
Co-Authored-By: Matěj Cepl <mcepl@cepl.eu>
Co-Authored-By: Charalampos Stratakis <cstratak@redhat.com>
Minor fix in documentation:
- `sys.__unraisablehook__` is new in version 3.8
- Optional `sep` and `bytes_per_sep` parameters for `bytearray.hex` is also supported in Python 3.8 (just like `bytes.hex`)
* "Return true/false" is replaced with "Return ``True``/``False``"
if the function actually returns a bool.
* Fixed formatting of some True and False literals (now in monospace).
* Replaced "True/False" with "true/false" if it can be not only bool.
* Replaced some 1/0 with True/False if it corresponds the code.
* "Returns <bool>" is replaced with "Return <bool>".