Commit Graph

11924 Commits

Author SHA1 Message Date
Victor Stinner 82458b6cdb
bpo-42236: Enhance _locale._get_locale_encoding() (GH-23083)
* Rename _Py_GetLocaleEncoding() to _Py_GetLocaleEncodingObject()
* Add _Py_GetLocaleEncoding() which returns a wchar_t* string to
  share code between _Py_GetLocaleEncodingObject()
  and config_get_locale_encoding().
* _Py_GetLocaleEncodingObject() now decodes nl_langinfo(CODESET)
  from the current locale encoding with surrogateescape,
  rather than using UTF-8.
2020-11-01 20:59:35 +01:00
Alexey Izbyshev d3b4e06807
bpo-42146: Unify cleanup in subprocess_fork_exec() (GH-22970)
* bpo-42146: Unify cleanup in subprocess_fork_exec()

Also ignore errors from _enable_gc():
* They are always suppressed by the current code due to a bug.
* _enable_gc() is only used if `preexec_fn != None`, which is unsafe.
* We don't have a good way to handle errors in case we successfully
  created a child process.

Co-authored-by: Gregory P. Smith <greg@krypto.org>
2020-10-31 22:33:08 -07:00
Erlend Egeberg Aasland 7d21027157
bpo-40956: Convert _sqlite3 module level functions to Argument Clinic (GH-22484) 2020-10-31 15:07:44 +09:00
Victor Stinner b62bdf71ea
bpo-42208: Add _locale._get_locale_encoding() (GH-23052)
* Add a new _locale._get_locale_encoding() function to get the
  current locale encoding.
* Modify locale.getpreferredencoding() to use it.
* Remove the _bootlocale module.
2020-10-31 01:32:11 +01:00
Victor Stinner 710e826307
bpo-42208: Add _Py_GetLocaleEncoding() (GH-23050)
_io.TextIOWrapper no longer calls getpreferredencoding(False) of
_bootlocale to get the locale encoding, but calls
_Py_GetLocaleEncoding() instead.

Add config_get_fs_encoding() sub-function. Reorganize also
config_get_locale_encoding() code.
2020-10-31 01:02:09 +01:00
Victor Stinner eba5bf2f56
bpo-42208: Call GC collect earlier in PyInterpreterState_Clear() (GH-23044)
The last GC collection is now done before clearing builtins and sys
dictionaries. Add also assertions to ensure that gc.collect() is no
longer called after _PyGC_Fini().

Pass also the tstate to PyInterpreterState_Clear() to pass the
correct tstate to _PyGC_CollectNoFail() and _PyGC_Fini().
2020-10-30 22:51:02 +01:00
Victor Stinner 8b3414818f
bpo-42208: Pass tstate to _PyGC_CollectNoFail() (GH-23038)
Move private _PyGC_CollectNoFail() to the internal C API.

Remove the private _PyGC_CollectIfEnabled() which was just an alias
to the public PyGC_Collect() function since Python 3.8.

Rename functions:

* collect() => gc_collect_main()
* collect_with_callback() => gc_collect_with_callback()
* collect_generations() => gc_collect_generations()
2020-10-30 17:00:00 +01:00
Victor Stinner 5776663675
bpo-42029: Remove IRIX code (GH-23023)
IRIX code was slowy removed in Python 2.4 (--with-sgi-dl), Python 3.3
(Irix threads), and Python 3.7.
2020-10-29 15:16:23 +01:00
Victor Stinner 35b95aaf21
bpo-42161: Micro-optimize _collections._count_elements() (GH-23008)
Move the _PyLong_GetOne() call outside the fast-path loop.
2020-10-27 22:24:33 +01:00
Victor Stinner 37834136d0
bpo-42161: Modules/ uses _PyLong_GetZero() and _PyLong_GetOne() (GH-22998)
Use _PyLong_GetZero() and _PyLong_GetOne() in Modules/ directory.

_cursesmodule.c and zoneinfo.c are now built with
Py_BUILD_CORE_MODULE macro defined.
2020-10-27 17:12:53 +01:00
Victor Stinner 84f7382215
bpo-42157: Rename unicodedata.ucnhash_CAPI (GH-22994)
Removed the unicodedata.ucnhash_CAPI attribute which was an internal
PyCapsule object. The related private _PyUnicode_Name_CAPI structure
was moved to the internal C API.

Rename unicodedata.ucnhash_CAPI as unicodedata._ucnhash_CAPI.
2020-10-27 04:36:22 +01:00
Victor Stinner c8c4200b65
bpo-42157: Convert unicodedata.UCD to heap type (GH-22991)
Convert the unicodedata extension module to the multiphase
initialization API (PEP 489) and convert the unicodedata.UCD static
type to a heap type.

Co-Authored-By: Mohamed Koubaa <koubaa.m@gmail.com>
2020-10-26 23:19:22 +01:00
Victor Stinner 920cb647ba
bpo-42157: unicodedata avoids references to UCD_Type (GH-22990)
* UCD_Check() uses PyModule_Check()
* Simplify the internal _PyUnicode_Name_CAPI structure:

  * Remove size and state members
  * Remove state and self parameters of getcode() and getname()
    functions

* Remove global_module_state
2020-10-26 19:19:36 +01:00
Victor Stinner 47e1afd2a1
bpo-1635741: _PyUnicode_Name_CAPI moves to internal C API (GH-22713)
The private _PyUnicode_Name_CAPI structure of the PyCapsule API
unicodedata.ucnhash_CAPI moves to the internal C API. Moreover, the
structure gets a new state member which must be passed to the
getcode() and getname() functions.

* Move Include/ucnhash.h to Include/internal/pycore_ucnhash.h
* unicodedata module is now built with Py_BUILD_CORE_MODULE.
* unicodedata: move hashAPI variable into unicodedata_module_state.
2020-10-26 16:43:47 +01:00
Serhiy Storchaka b510e101f8
bpo-42152: Use PyDict_Contains and PyDict_SetDefault if appropriate. (GH-22986)
If PyDict_GetItemWithError is only used to check whether the key is in dict,
it is better to use PyDict_Contains instead.

And if it is used in combination with PyDict_SetItem, PyDict_SetDefault can
replace the combination.
2020-10-26 12:47:57 +02:00
Serhiy Storchaka fb5db7ec58
bpo-42006: Stop using PyDict_GetItem, PyDict_GetItemString and _PyDict_GetItemId. (GH-22648)
These functions are considered not safe because they suppress all internal errors
and can return wrong result.  PyDict_GetItemString and _PyDict_GetItemId can
also silence current exception in rare cases.

Remove no longer used _PyDict_GetItemId.
Add _PyDict_ContainsId and rename _PyDict_Contains into
_PyDict_Contains_KnownHash.
2020-10-26 08:43:39 +02:00
Alexey Izbyshev c0590c0033
bpo-42146: Fix memory leak in subprocess.Popen() in case of uid/gid overflow (GH-22966)
Fix memory leak in subprocess.Popen() in case of uid/gid overflow

Also add a test that would catch this leak with `--huntrleaks`.

Alas, the test for `extra_groups` also exposes an inconsistency
in our error reporting: we use a custom ValueError for `extra_groups`,
but propagate OverflowError for `user` and `group`.
2020-10-25 17:09:32 -07:00
Zackery Spytz c32f2976b8
bpo-42144: Add a missing "goto error;" in the _ssl module (GH-22959) 2020-10-25 11:02:30 -07:00
Gregory P. Smith be3c3a0e46
bpo-35823: Allow setsid() after vfork() on Linux. (GH-22945)
It should just be a syscall updating a couple of fields in the kernel side
process info.  Confirming, in glibc is appears to be a shim for the setsid
syscall (based on not finding any code implementing anything special for it)
and in uclibc (*much* easier to read) it is clearly just a setsid syscall shim.

A breadcrumb _suggesting_ that it is not allowed on Darwin/macOS comes from
a commit in emacs: https://lists.gnu.org/archive/html/bug-gnu-emacs/2017-04/msg00297.html
but I don't have a way to verify if that is true or not.
As we are not supporting vfork on macOS today I just left a note in a comment.
2020-10-24 12:07:35 -07:00
Serhiy Storchaka 8cd1dbae32
bpo-41052: Fix pickling heap types implemented in C with protocols 0 and 1 (GH-22870) 2020-10-24 21:14:23 +03:00
Alexey Izbyshev 473db47747
bpo-35823: subprocess: Fix handling of pthread_sigmask() errors (GH-22944)
Using POSIX_CALL() is incorrect since pthread_sigmask() returns
the error number instead of setting errno.

Also handle failure of the first call to pthread_sigmask()
in the parent process, and explain why we don't handle failure
of the second call in a comment.
2020-10-24 10:47:38 -07:00
Alexey Izbyshev 976da903a7
bpo-35823: subprocess: Use vfork() instead of fork() on Linux when safe (GH-11671)
* bpo-35823: subprocess: Use vfork() instead of fork() on Linux when safe

When used to run a new executable image, fork() is not a good choice
for process creation, especially if the parent has a large working set:
fork() needs to copy page tables, which is slow, and may fail on systems
where overcommit is disabled, despite that the child is not going to
touch most of its address space.

Currently, subprocess is capable of using posix_spawn() instead, which
normally provides much better performance. However, posix_spawn() does not
support many of child setup operations exposed by subprocess.Popen().
Most notably, it's not possible to express `close_fds=True`, which
happens to be the default, via posix_spawn(). As a result, most users
can't benefit from faster process creation, at least not without
changing their code.

However, Linux provides vfork() system call, which creates a new process
without copying the address space of the parent, and which is actually
used by C libraries to efficiently implement posix_spawn(). Due to sharing
of the address space and even the stack with the parent, extreme care
is required to use vfork(). At least the following restrictions must hold:

* No signal handlers must execute in the child process. Otherwise, they
  might clobber memory shared with the parent, potentially confusing it.

* Any library function called after vfork() in the child must be
  async-signal-safe (as for fork()), but it must also not interact with any
  library state in a way that might break due to address space sharing
  and/or lack of any preparations performed by libraries on normal fork().
  POSIX.1 permits to call only execve() and _exit(), and later revisions
  remove vfork() specification entirely. In practice, however, almost all
  operations needed by subprocess.Popen() can be safely implemented on
  Linux.

* Due to sharing of the stack with the parent, the child must be careful
  not to clobber local variables that are alive across vfork() call.
  Compilers are normally aware of this and take extra care with vfork()
  (and setjmp(), which has a similar problem).

* In case the parent is privileged, special attention must be paid to vfork()
  use, because sharing an address space across different privilege domains
  is insecure[1].

This patch adds support for using vfork() instead of fork() on Linux
when it's possible to do safely given the above. In particular:

* vfork() is not used if credential switch is requested. The reverse case
  (simple subprocess.Popen() but another application thread switches
  credentials concurrently) is not possible for pure-Python apps because
  subprocess.Popen() and functions like os.setuid() are mutually excluded
  via GIL. We might also consider to add a way to opt-out of vfork() (and
  posix_spawn() on platforms where it might be implemented via vfork()) in
  a future PR.

* vfork() is not used if `preexec_fn != None`.

With this change, subprocess will still use posix_spawn() if possible, but
will fallback to vfork() on Linux in most cases, and, failing that,
to fork().

[1] https://ewontfix.com/7

Co-authored-by: Gregory P. Smith [Google LLC] <gps@google.com>
2020-10-23 17:47:01 -07:00
Christian Heimes dde91b1953
bpo-1635741: Fix NULL ptr deref in multiprocessing (GH-22880)
Commit 1d541c25c8 introduced a NULL
pointer dereference in error path.

Signed-off-by: Christian Heimes <christian@python.org>
2020-10-22 03:20:36 -07:00
Dong-hee Na 8a9463f2d4
_testmultiphase: Fix possible ref leak (GH-22881)
This is just test code, but sometimes external contributors reference the code snippets from test code.
`PyModule_AddObject` should be handled in the proper way.

https://docs.python.org/3/c-api/module.html#c.PyModule_AddObject
2020-10-22 02:44:18 -07:00
Vladimir Matveev c8ba47b551
Delete TaskWakeupMethWrapper_Type and use PyCFunction instead (#22875) 2020-10-21 17:49:10 -07:00
TIGirardi f2312037e3
bpo-38324: Fix test__locale.py Windows failures (GH-20529)
Use wide-char _W_* fields of lconv structure on Windows
Remove "ps_AF" from test__locale.known_numerics on Windows
2020-10-20 12:39:52 +01:00
Jakub Jelen d5d0521270
md5module: Fix doc strings variable names (GH-22722) 2020-10-20 18:10:43 +09:00
Raymond Hettinger 871934d4cf
bpo-4356: Add key function support to the bisect module (GH-20556) 2020-10-19 22:04:01 -07:00
Ruben Vorderman 23c0fb8edd
bpo-41586: Add pipesize parameter to subprocess & F_GETPIPE_SZ and F_SETPIPE_SZ to fcntl. (GH-21921)
* Add F_SETPIPE_SZ and F_GETPIPE_SZ to fcntl module
* Add pipesize parameter for subprocess.Popen class

This will allow the user to control the size of the pipes.
On linux the default is 64K. When a pipe is full it blocks for writing.
When a pipe is empty it blocks for reading. On processes that are
very fast this can lead to a lot of wasted CPU cycles. On a typical
Linux system the max pipe size is 1024K which is much better.
For high performance-oriented libraries such as xopen it is nice to
be able to set the pipe size.

The workaround without this feature is to use my_popen_process.stdout.fileno() in
conjuction with fcntl and 1031 (value of F_SETPIPE_SZ) to acquire this behavior.
2020-10-19 16:30:02 -07:00
Jason R. Coombs 5456e78f45
bpo-16396: Allow wintypes to be imported on non-Windows systems. (GH-21394)
Co-authored-by: Christian Heimes <christian@python.org>
2020-10-19 23:06:05 +01:00
Serhiy Storchaka 1bcaa81e52
bpo-20184: Convert termios to Argument Clinic. (GH-22693) 2020-10-18 17:54:06 +03:00
Hai Shi c9f696cb96
bpo-41919, test_codecs: Move codecs.register calls to setUp() (GH-22513)
* Move the codecs' (un)register operation to testcases.
* Remove _codecs._forget_codec() and _PyCodec_Forget()
2020-10-16 10:34:15 +02:00
Victor Stinner e6b8c5263a
bpo-1635741: Add a global module state to unicodedata (GH-22712)
Prepare unicodedata to add a state per module: start with a global
"module" state, pass it to subfunctions which access &UCD_Type. This
change also prepares the conversion of the UCD_Type static type to a
heap type.
2020-10-15 16:22:19 +02:00
Erlend Egeberg Aasland 644e94272a
bpo-42021: Fix possible ref leaks during _sqlite3 module init (GH-22673) 2020-10-15 21:20:15 +09:00
Brandt Bucher c13b847a6f
bpo-41984: GC track all user classes (GH-22701) 2020-10-14 18:44:07 -07:00
Kyle Evans 7992579cd2
bpo-40422: Move _Py_closerange to fileutils.c (GH-22680)
This API is relatively lightweight and organizationally, given that it's
used by multiple modules, it makes sense to move it to fileutils.

Requires making sure that _posixsubprocess is compiled with the appropriate
Py_BUIILD_CORE_BUILTIN macro.
2020-10-13 22:04:44 +02:00
Yunlongs 66c28f50c7
bpo-41995: Fix null ptr deref in tracemalloc_copy_trace() (GH-22660)
Fix a null pointer dereference in tracemalloc_copy_trace()
of _tracemalloc.
2020-10-13 08:46:31 +02:00
Kyle Evans 64eb259cc1
bpo-40422: Move _Py_*_SUPPRESS_IPH bits into _Py_closerange (GH-22672)
This suppression is no longer needed in os_closerange_impl, as it just
invokes the internal _Py_closerange implementation. On the other hand,
consumers of _Py_closerange may not have any other reason to suppress
invalid parameter issues, so narrow the scope to here.
2020-10-12 16:53:16 -07:00
Kyle Evans 1800c60080
bpo-40423: Optimization: use close_range(2) if available (GH-22651)
close_range(2) should be preferred at all times if it's available, otherwise we'll use closefrom(2) if available with a fallback to fdwalk(3) or plain old loop over fd range in order of most efficient to least.

[note that this version does check for ENOSYS, but currently ignores all other errors]

Automerge-Triggered-By: @pablogsal
2020-10-11 13:18:53 -07:00
Kyle Evans c230fde847
bpo-40422: create a common _Py_closerange API (GH-19754)
Such an API can be used both for os.closerange and subprocess. For the latter, this yields potential improvement for platforms that have fdwalk but wouldn't have used it there. This will prove even more beneficial later for platforms that have close_range(2), as the new API will prefer that over all else if it's available.

The new API is structured to look more like close_range(2), closing from [start, end] rather than the [low, high) of os.closerange().

Automerge-Triggered-By: @gpshead
2020-10-11 11:54:11 -07:00
Vladimir Matveev 037245c5ac
bpo-41756: Add PyIter_Send function (#22443) 2020-10-09 17:15:15 -07:00
Serhiy Storchaka 9975cc5008
bpo-41985: Add _PyLong_FileDescriptor_Converter and AC converter for "fildes". (GH-22620) 2020-10-09 23:00:45 +03:00
Raymond Hettinger 4e0ce82058
Revert "bpo-26680: Incorporate is_integer in all built-in and standard library numeric types (GH-6121)" (GH-22584)
This reverts commit 58a7da9e12.
2020-10-07 16:43:44 -07:00
Ram Rachum 52301312bb
bpo-41867: List options for timespec in docstrings of isoformat methods (GH-22418) 2020-10-03 13:43:47 +03:00
Raymond Hettinger 497126f7ea
Update link to supporting references (GH-22488) 2020-10-01 19:30:54 -07:00
Robert Smallshire 58a7da9e12
bpo-26680: Incorporate is_integer in all built-in and standard library numeric types (GH-6121)
* bpo-26680: Adds support for int.is_integer() for compatibility with float.is_integer().

The int.is_integer() method always returns True.

* bpo-26680: Adds a test to ensure that False.is_integer() and True.is_integer() are always True.

* bpo-26680: Adds Real.is_integer() with a trivial implementation using conversion to int.

This default implementation is intended to reduce the workload for subclass
implementers. It is not robust in the presence of infinities or NaNs and
may have suboptimal performance for other types.

* bpo-26680: Adds Rational.is_integer which returns True if the denominator is one.

This implementation assumes the Rational is represented in it's
lowest form, as required by the class docstring.

* bpo-26680: Adds Integral.is_integer which always returns True.

* bpo-26680: Adds tests for Fraction.is_integer called as an instance method.

The tests for the Rational abstract base class use an unbound
method to sidestep the inability to directly instantiate Rational.
These tests check that everything works correct as an instance method.

* bpo-26680: Updates documentation for Real.is_integer and built-ins int and float.

The call x.is_integer() is now listed in the table of operations
which apply to all numeric types except complex, with a reference
to the full documentation for Real.is_integer().  Mention of
is_integer() has been removed from the section 'Additional Methods
on Float'.

The documentation for Real.is_integer() describes its purpose, and
mentions that it should be overridden for performance reasons, or
to handle special values like NaN.

* bpo-26680: Adds Decimal.is_integer to the Python and C implementations.

The C implementation of Decimal already implements and uses
mpd_isinteger internally, we just expose the existing function to
Python.

The Python implementation uses internal conversion to integer
using to_integral_value().

In both cases, the corresponding context methods are also
implemented.

Tests and documentation are included.

* bpo-26680: Updates the ACKS file.

* bpo-26680: NEWS entries for int, the numeric ABCs and Decimal.

Co-authored-by: Robert Smallshire <rob@sixty-north.com>
2020-10-01 17:30:08 +01:00
Erlend Egeberg Aasland 256e54acdb
bpo-41861: Convert _sqlite3 CursorType and ConnectionType to heap types (GH-22478) 2020-10-01 16:03:21 +02:00
Erlend Egeberg Aasland 9031bd4fa4
bpo-41861: Convert _sqlite3 RowType and StatementType to heap types (GH-22444) 2020-10-01 15:24:31 +02:00
Erlend Egeberg Aasland cb6db8b6ae
bpo-41861: Convert _sqlite3 PrepareProtocolType to heap type (GH-22428) 2020-09-29 00:05:04 +02:00
Hai Shi d332e7b816
bpo-41842: Add codecs.unregister() function (GH-22360)
Add codecs.unregister() and PyCodec_Unregister() functions
to unregister a codec search function.
2020-09-28 23:41:11 +02:00