Call the SQLite APIs first, and return early in case of error. At the end,
allocate the object and initialise its members. We now avoid unneeded
alloc/dealloc cycles when statement creation fails.
Convert the Py_TYPE() and Py_SIZE() macros to static inline
functions. The Py_SET_TYPE() and Py_SET_SIZE() functions must now be
used to set an object type and size.
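For third-party extensions this mostly means switching assignments to the new setters. A minimal sketch of the migration (the object and type names are illustrative):
```c
#include <Python.h>

/* Illustrative: re-point an object's type and adjust its variable size.
   The old spellings "Py_TYPE(obj) = type;" and "Py_SIZE(obj) = size;"
   no longer compile once the macros are static inline functions. */
static void
retag_object(PyVarObject *obj, PyTypeObject *new_type, Py_ssize_t new_size)
{
    Py_SET_TYPE((PyObject *)obj, new_type);
    Py_SET_SIZE(obj, new_size);
}
```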
The sqlite3 module now fully implements the GC protocol, so there's no
need for this workaround anymore.
- Add and use managed resource helper for connections using TESTFN
- Sort test imports
- Split open-tests into their own test case
Automerge-Triggered-By: GH:vstinner
Allocate and track statement objects in pysqlite_statement_create.
By allocating and tracking the creation of statement objects in
pysqlite_statement_create(), the caller does not need to worry about GC
synchronization, and the possibility of getting a badly created object is
eliminated. All related fault handling is moved to
pysqlite_statement_create().
Co-authored-by: Victor Stinner <vstinner@python.org>
* _testcapi.heapgctype: implement a traverse function since the type
is defined with Py_TPFLAGS_HAVE_GC.
* _decimal: PyDecSignalDictMixin_Type is no longer defined with
Py_TPFLAGS_HAVE_GC since it has no traverse function.
The newest gcc emits this warning:
```
/Modules/_tkinter.c:272:9: warning: this ‘if’ clause does not guard... [-Wmisleading-indentation]
272 | if(tcl_lock)PyThread_acquire_lock(tcl_lock, 1); tcl_tstate = tstate; }
| ^~
/Modules/_tkinter.c:2869:5: note: in expansion of macro ‘LEAVE_PYTHON’
2869 | LEAVE_PYTHON
| ^~~~~~~~~~~~
/Modules/_tkinter.c:243:5: note: ...this statement, but the latter is misleadingly indented as if it were guarded by the ‘if’
243 | (*(PyThreadState**)Tcl_GetThreadData(&state_key, sizeof(PyThreadState*)))
| ^
/Modules/_tkinter.c:272:57: note: in expansion of macro ‘tcl_tstate’
272 | if(tcl_lock)PyThread_acquire_lock(tcl_lock, 1); tcl_tstate = tstate; }
| ^~~~~~~~~~
/Modules/_tkinter.c:2869:5: note: in expansion of macro ‘LEAVE_PYTHON’
2869 | LEAVE_PYTHON
```
That's because the macro packs together two statements at the same level
as the "if". The warning is misleading, but it is very noisy, so it makes
sense to fix it.
"Zero cost" exception handling.
* Uses a lookup table to determine how to handle exceptions.
* Removes SETUP_FINALLY and POP_TOP block instructions, eliminating (most of) the runtime overhead of try statements.
* Reduces the size of the frame object by about 60%.
* Fix: the initial buffer size couldn't exceed UINT32_MAX in the zlib module
After commit f9bedb630e, in 64-bit builds,
a ValueError was raised if the initial buffer size exceeded UINT32_MAX.
These two functions are affected:
1. zlib.decompress(data, /, wbits=MAX_WBITS, bufsize=DEF_BUF_SIZE)
2. zlib.Decompress.flush([length])
This commit again allows sizes greater than UINT32_MAX.
* adds curly braces per PEP 7.
* Renames `Buffer_*` to `OutputBuffer_*` for clarity
Add a new Py_TPFLAGS_DISALLOW_INSTANTIATION type flag to disallow
creating type instances: set tp_new to NULL and don't create the
"__new__" key in the type dictionary.
The flag is set automatically on static types if tp_base is NULL or
&PyBaseObject_Type and tp_new is NULL.
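For illustration, a minimal sketch of a static type that uses the flag explicitly (the type name is made up):
```c
#include <Python.h>

/* Instances can only be created from C; calling example.Version() in
   Python raises TypeError because tp_new is NULL and the flag prevents
   inheriting object.__new__. */
static PyTypeObject ExampleVersion_Type = {
    PyVarObject_HEAD_INIT(NULL, 0)
    .tp_name = "example.Version",
    .tp_basicsize = sizeof(PyObject),
    .tp_flags = Py_TPFLAGS_DEFAULT | Py_TPFLAGS_DISALLOW_INSTANTIATION,
    /* tp_new deliberately left NULL */
};
```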
Use the flag on the following types:
* _curses.ncurses_version type
* _curses_panel.panel
* _tkinter.Tcl_Obj
* _tkinter.tkapp
* _tkinter.tktimertoken
* _xxsubinterpretersmodule.ChannelID
* sys.flags type
* sys.getwindowsversion() type
* sys.version_info type
Update MyStr example in the C API documentation to use
Py_TPFLAGS_DISALLOW_INSTANTIATION.
Add _PyStructSequence_InitType() function to create a structseq type
with the Py_TPFLAGS_DISALLOW_INSTANTIATION flag set.
type_new() calls _PyType_CheckConsistency() at exit.
* Add Py_TPFLAGS_SEQUENCE and Py_TPFLAGS_MAPPING, add to all relevant standard builtin classes.
* Set relevant flags on collections.abc.Sequence and Mapping.
* Use flags in MATCH_SEQUENCE and MATCH_MAPPING opcodes (see the sketch after this list).
* Inherit Py_TPFLAGS_SEQUENCE and Py_TPFLAGS_MAPPING.
* Add NEWS
* Remove interpreter-state map_abc and seq_abc fields.
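A minimal sketch of the flag check behind the opcode bullet above (names are illustrative; the real opcodes do more):
```c
#include <Python.h>

/* Classify a match-statement subject with a single flag test on its
   type, instead of an isinstance() check against collections.abc. */
static int
match_kind(PyObject *subject)
{
    PyTypeObject *tp = Py_TYPE(subject);
    if (PyType_HasFeature(tp, Py_TPFLAGS_SEQUENCE)) {
        return 1;   /* a sequence pattern can match */
    }
    if (PyType_HasFeature(tp, Py_TPFLAGS_MAPPING)) {
        return 2;   /* a mapping pattern can match */
    }
    return 0;
}
```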
Add new C-API functions to control the state of the garbage collector:
PyGC_Enable(), PyGC_Disable(), PyGC_IsEnabled(),
corresponding to the functions in the gc module.
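A minimal usage sketch (the work being wrapped is illustrative):
```c
#include <Python.h>

/* Temporarily disable cyclic garbage collection around a critical
   section, then restore the previous state, mirroring gc.disable()
   and gc.enable() in Python. */
static void
run_without_gc(void (*work)(void))
{
    int was_enabled = PyGC_IsEnabled();
    PyGC_Disable();
    work();
    if (was_enabled) {
        PyGC_Enable();
    }
}
```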
Co-authored-by: Pablo Galindo <Pablogsal@gmail.com>
Co-authored-by: Victor Stinner <vstinner@python.org>
Faster bz2/lzma/zlib via new output buffering.
Also adds .readall() function to _compression.DecompressReader class
to take best advantage of this in the consume-all-output at once scenario.
Often a 5-20% speedup in common scenarios due to less data copying.
Contributed by Ma Lin.
* Add signal_state_t structure and signal_global_state variable.
* Add a module state to the _signal module.
* Move and rename variables:
* DefaultHandler becomes state->default_handler
* IgnoreHandler becomes state->ignore_handler
* sigint_event becomes state->sigint_event
* ItimerError becomes modstate->itimer_error
* Rename SetHandler() to set_handler() to be consistent with
get_handler().
Importing the _signal module in a subinterpreter no longer has side
effects.
signal_module_exec() no longer modifies Handlers and no longer attempts
to set SIGINT signal handler in subinterpreters.
The internal `_ssl._SSLSocket` object now provides methods to retrieve
the peer cert chain and verified cert chain as a list of Certificate
objects. Certificate objects have methods to convert the cert to a dict,
PEM, or DER (ASN.1).
These are private APIs for now. There is a slim chance to stabilize the
approach and provide a public API for 3.10. Otherwise I'll provide a
stable API in 3.11.
Signed-off-by: Christian Heimes <christian@python.org>
asyncio.get_event_loop() now emits a deprecation warning when it creates a new event loop.
In future releases it will become an alias of asyncio.get_running_loop().
This works by not caching the handle and instead getting the handle from
the file descriptor each time, so that if the actual handle changes because
fd redirection closes/reopens the console handle beneath our feet, we
keep working correctly.
Add pycore_moduleobject.h internal header file with static inline
functions to access module members:
* _PyModule_GetDict()
* _PyModule_GetDef()
* _PyModule_GetState()
These functions don't check at runtime if their argument has a valid
type and can be inlined even if Python is not built with LTO.
_PyType_GetModuleByDef() uses _PyModule_GetDef().
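A sketch of the accessor pattern a converted module can use (the module and struct names are made up; the header is internal, so a Py_BUILD_CORE build is assumed):
```c
#include "Python.h"
#include "pycore_moduleobject.h"   // _PyModule_GetState()
#include <assert.h>

typedef struct {
    PyObject *error_type;          /* illustrative module state */
} examplemodule_state;

static inline examplemodule_state *
get_examplemodule_state(PyObject *module)
{
    /* Unlike PyModule_GetState(), no runtime type check is performed,
       and the call can be inlined even without LTO. */
    void *state = _PyModule_GetState(module);
    assert(state != NULL);
    return (examplemodule_state *)state;
}
```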
Replace PyModule_GetState() with _PyModule_GetState() in the following
extension modules, which are considered performance sensitive:
* _abc
* _functools
* _operator
* _pickle
* _queue
* _random
* _sre
* _struct
* _thread
* _winapi
* array
* posix
The following extensions are now built with the Py_BUILD_CORE_MODULE
macro defined, to be able to use the internal pycore_moduleobject.h
header: _abc, array, _operator, _queue, _sre, _struct.
The ssl module now uses ``SSL_read_ex`` and ``SSL_write_ex``
internally. The functions support reading and writing of data larger
than 2 GB. Writing zero-length data no longer fails with a protocol
violation error.
Signed-off-by: Christian Heimes <christian@python.org>
Commit 93d50a6a8d / GH-21855 changed the
order of variable definitions, which introduced a potential invalid-free
bug. The Py_buffer object is now initialized earlier, and the result of the
Keccak initialization is verified.
Co-authored-by: Alex Henrie <alexhenrie24@gmail.com>
Signed-off-by: Christian Heimes <christian@python.org>
Co-authored-by: Alex Henrie <alexhenrie24@gmail.com>
- Introduce sslmodule_slots
- Introduce sslmodulestate
- Use sslmodulestate
- Get rid of PyState_FindModule
- Move new structs and helpers to header file
- Use macros to access state
- Keep a strong ref to socket type
- Remove HAVE_X509_VERIFY_PARAM_SET1_HOST check
- Update hashopenssl to require OpenSSL 1.1.1
- multissltests only OpenSSL > 1.1.0
- ALPN is always supported
- SNI is always supported
- Remove deprecated NPN code. Python wrappers are no-op.
- ECDH is always supported
- Remove OPENSSL_VERSION_1_1 macro
- Remove locking callbacks
- Drop PY_OPENSSL_1_1_API macro
- Drop HAVE_SSL_CTX_CLEAR_OPTIONS macro
- SSL_CTRL_GET_MAX_PROTO_VERSION is always defined now
- security level is always available now
- get_num_tickets is available with TLS 1.3
- X509_V_ERR MISMATCH is always available now
- Always set SSL_MODE_RELEASE_BUFFERS
- X509_V_FLAG_TRUSTED_FIRST is always available
- get_ciphers is always supported
- SSL_CTX_set_keylog_callback is always available
- Update Modules/Setup with static link example
- Mention PEP in whatsnew
- Drop 1.0.2 and 1.1.0 from GHA tests
Fix problem with ssl.SSLContext.hostname_checks_common_name. OpenSSL does not
copy hostflags from *struct SSL_CTX* to *struct SSL*.
Signed-off-by: Christian Heimes <christian@python.org>
Add the Py_Is(x, y) function to test if the 'x' object is the 'y'
object, the same as "x is y" in Python. Add also the Py_IsNone(),
Py_IsTrue(), Py_IsFalse() functions to test if an object is,
respectively, the None singleton, the True singleton or the False
singleton.
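A minimal sketch of the new helpers (the function is illustrative):
```c
#include <Python.h>

/* Same as the Python expression "arg is sentinel or arg is None".
   No reference counts are touched. */
static int
is_unset(PyObject *arg, PyObject *sentinel)
{
    return Py_Is(arg, sentinel) || Py_IsNone(arg);
}
```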
xxlimited.c and xxlimited_35.c now define the Py_LIMITED_API macro,
rather than having to do it in the build recipe.
Co-authored-by: Hai Shi <shihai1992@gmail.com>
See [PEP 597](https://www.python.org/dev/peps/pep-0597/).
* Add `-X warn_default_encoding` and `PYTHONWARNDEFAULTENCODING`.
* Add EncodingWarning
* Add io.text_encoding()
* open() and TextIOWrapper() emit an EncodingWarning when the encoding is omitted and warn_default_encoding is enabled.
* _pyio.TextIOWrapper() uses UTF-8 as the fallback default encoding when the locale module cannot be imported (used while building Python).
* bz2, configparser, gzip, lzma, pathlib, tempfile modules use io.text_encoding().
* What's new entry
This reverts commit 32f2eda859.
It can be re-applied if the subinterpreter PEP is approved.
Otherwise, the commit degraded performance with no offsetting
benefit.
These functions were undocumented and excluded from the limited C
API.
Most names defined by these header files were not prefixed by "Py"
and so could create name conflicts. For example, Python-ast.h
defined a "Yield" macro which conflicted with the "Yield" name used
by the Windows <winbase.h> header.
Use the Python ast module instead.
* Move Include/asdl.h to Include/internal/pycore_asdl.h.
* Move Include/Python-ast.h to Include/internal/pycore_ast.h.
* Remove ast.h header file.
* pycore_symtable.h no longer includes Python-ast.h.
Stefan Krah requested the reversal of these (unreleased) changes, quoting him:
> The capsule API does not meet my testing standards, since I've focused
on the upstream mpdecimal in the last couple of months.
> Additionally, I'd like to refine the API, perhaps together with the
Arrow community.
Automerge-Triggered-By: GH:pitrou
OpenSSL copies the internal message callback from SSL_CTX->msg_callback to
SSL->msg_callback. SSL_set_SSL_CTX() does not update SSL->msg_callback
to use the callback value of the new context.
PySSL_set_context() now resets the callback and _PySSL_msg_callback()
resets thread state in error path.
Signed-off-by: Christian Heimes <christian@python.org>
Rename Include/symtable.h to Include/internal/pycore_symtable.h,
don't export symbols anymore (replace PyAPI_FUNC and PyAPI_DATA with
extern) and rename functions:
* PyST_GetScope() to _PyST_GetScope()
* PySymtable_BuildObject() to _PySymtable_Build()
* PySymtable_Free() to _PySymtable_Free()
Remove PySymtable_Build(), Py_SymtableString() and
Py_SymtableStringObject() functions.
The Py_SymtableString() function was part of the stable ABI by mistake,
but it could not be used, since the symtable.h header file was
excluded from the limited C API.
The Python symtable module remains available and is unchanged.
We can receive signals (at the C level, in `trip_signal()` in signalmodule.c) while `signal.signal` is being called to modify the corresponding handler. Later when `PyErr_CheckSignals()` is called to handle the given signal, the handler may be a non-callable object and would raise a cryptic asynchronous exception.
From the commit message:
> When the structure is packed we should always expand when needed,
> otherwise we will add some padding between the fields. This patch makes
> sure we always merge bitfields together. It also changes the field merging
> algorithm so that it handles bitfields correctly.
Automerge-Triggered-By: GH:jaraco
From the SQLite 3.5.3 changelog:
sqlite3_step() returns SQLITE_MISUSE instead of crashing when called
with a NULL parameter.
The workaround is no longer needed because we no longer support
SQLite releases older than 3.7.15.
When very large data remains in a TextIOWrapper, flush() may fail forever.
Therefore, prevent data larger than chunk_size from remaining in the
TextIOWrapper internal buffer.
Co-Authored-By: Eryk Sun
Pass the current interpreter (interp) rather than the current Python
thread state (tstate) to internal functions which only use the
interpreter.
Modified functions:
* _PyXXX_Fini() and _PyXXX_ClearFreeList() functions
* _PyEval_SignalAsyncExc(), make_pending_calls()
* _PySys_GetObject(), sys_set_object(), sys_set_object_id(), sys_set_object_str()
* should_audit(), set_flags_from_config(), make_flags()
* _PyAtExit_Call()
* init_stdio_encoding()
* etc.
Replace _PyThreadState_GET() with _PyInterpreterState_GET() in
functions which only need the current interpreter, but don't need the
current Python thread state.
Replace also _PyThreadState_UncheckedGet() with _PyThreadState_GET()
in faulthandler.c, since _PyThreadState_UncheckedGet() is just an
alias to _PyThreadState_GET() in practice.
bpo-43172: readline now passes its tests when built against libedit.
Existing irreconcilable API differences remain in readline.get_begidx
and readline.get_endidx behavior based on libreadline vs libedit use.
A note about that has been documented.
In contrast to macOS, libedit is available as its own include file and
library on Linux systems to prevent file name clashes. So if both
libraries are available on the system, readline is currently chosen by
default; and if only libedit is available, it is not found at all. This
patch adds a way to link against libedit by adding the following
arguments to configure:
--with-readline link against libreadline (the default)
--with-readline=editline link against libeditline
--with-readline=no disable building the readline module
--without-readline (same)
The runtime detection of libedit vs. readline was already done in commit
7105319ada (2019-12-04, serge-sans-paille: "bpo-38634: Allow
non-apple build to cope with libedit (GH-16986)").
Fixes: GH-12076 ("bpo-13501 Build or disable readline with Editline")
Fixes: bpo-13501 ("Make libedit support more generic; port readline / libedit to FreeBSD")
Co-authored-by: Enji Cooper (ngie-eign)
Co-authored-by: Martin Panter (vadmium)
Co-authored-by: Robert Marshall (kellinm)
PyObject_RichCompareBool() returns -1 on error, but this case is
not handled by the find_in_strong_cache() function. Any exception
raised by PyObject_RichCompareBool() should be propagated.
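A sketch of the corrected pattern (the cache-entry contract around it is illustrative):
```c
#include <Python.h>

/* Return a new reference to 'value' when 'entry_key' equals 'key',
   NULL with an exception set on comparison error, or NULL with no
   exception set when the keys simply differ. */
static PyObject *
cache_lookup(PyObject *entry_key, PyObject *key, PyObject *value)
{
    int eq = PyObject_RichCompareBool(entry_key, key, Py_EQ);
    if (eq < 0) {
        return NULL;        /* the comparison raised: propagate it */
    }
    if (eq == 0) {
        return NULL;        /* not equal, no exception set */
    }
    return Py_NewRef(value);
}
```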
* bpo-42979: Enhance abstract.c assertions checking slot result
Add the _Py_CheckSlotResult() function, which fails with a fatal error if
a slot function succeeded with an exception set or failed with no
exception set: it reports the slot name, the type name and the current
exception (if an exception is set).
The Py_FatalError() function and the faulthandler module now dump the
list of extension modules on a fatal error.
Add _Py_DumpExtensionModules() and _PyModule_IsExtension() internal
functions.
Convert the 6 CJK codec extension modules (_codecs_cn, _codecs_hk,
_codecs_iso2022, _codecs_jp, _codecs_kr and _codecs_tw) to the
multiphase initialization API (PEP 489).
Remove getmultibytecodec() local cache: always import
_multibytecodec. It should be uncommon to get a codec. For example,
this function is only called once per CJK codec module.
Fix a reference leak in register_maps() error path.
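For reference, a minimal sketch of the multiphase (PEP 489) boilerplate such a conversion follows (the module name and exec contents are illustrative):
```c
#include <Python.h>

static int
example_exec(PyObject *module)
{
    /* Per-module initialization: create types, add constants, ... */
    return 0;
}

static PyModuleDef_Slot example_slots[] = {
    {Py_mod_exec, example_exec},
    {0, NULL},
};

static struct PyModuleDef example_module = {
    PyModuleDef_HEAD_INIT,
    .m_name = "_codecs_example",
    .m_size = 0,
    .m_slots = example_slots,
};

PyMODINIT_FUNC
PyInit__codecs_example(void)
{
    /* Multiphase init: return the definition; the import machinery
       creates the module object and runs the slots. */
    return PyModuleDef_Init(&example_module);
}
```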
A compiler that doesn't define `__has_builtin` will error out when it is
used on the same line as the check for it.
Automerge-Triggered-By: GH:ronaldoussoren
The coding cookie (ex: "# coding: latin1") is now ignored in the
command passed to the -c command line option.
Since pymain_run_command() uses UTF-8, pass PyCF_IGNORE_COOKIE
compiler flag to the parser.
pymain_run_python() no longer propagates compiler flags between
function calls.
Add pycore_atomic_funcs.h internal header file: similar to
pycore_atomic.h, but it does not require declaring variables as atomic.
Add _Py_atomic_size_get() and _Py_atomic_size_set() functions.
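A minimal sketch of the new helpers (the variable is illustrative; the header is internal, so a Py_BUILD_CORE build is assumed):
```c
#include "Python.h"
#include "pycore_atomic_funcs.h"   // _Py_atomic_size_get(), _Py_atomic_size_set()

static Py_ssize_t shared_size = 0;   /* a plain variable, not declared _Atomic */

static void
publish_size(Py_ssize_t value)
{
    /* Atomic store; another thread can read it with the getter. */
    _Py_atomic_size_set(&shared_size, value);
}

static Py_ssize_t
read_size(void)
{
    return _Py_atomic_size_get(&shared_size);
}
```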
Port the _thread extension module to the multiphase initialization
API (PEP 489) and convert its static types to heap types.
Add a traverse function to the lock type, so the garbage collector
can break reference cycles.
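A minimal sketch of such a traverse function (the object layout is illustrative):
```c
#include <Python.h>

typedef struct {
    PyObject_HEAD
    /* ... lock state ... */
} lockobject;

static int
lock_traverse(lockobject *self, visitproc visit, void *arg)
{
    /* Instances of heap types hold a strong reference to their type;
       visiting it lets the collector see and break cycles that run
       through the type object. */
    Py_VISIT(Py_TYPE(self));
    return 0;
}
```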
"uint8_t day" is unsigned and so "day < 0" test is always true.
Remove the test to fix the following warnings on Windows:
modules\_zoneinfo.c(1224): warning C4068: unknown pragma
modules\_zoneinfo.c(1225): warning C4068: unknown pragma
modules\_zoneinfo.c(1227): warning C4068: unknown pragma
* Fix ExceptHookArgsType name: "_thread.ExceptHookArgs", instead of
"_thread._ExceptHookArgs".
* PyInit__thread() no longer initializes interp->num_threads to 0:
it is already done in PyInterpreterState_New().
* Use PyModule_AddType(), Py_NewRef() and Py_XNewRef().
* Replace str_dict variable with _Py_IDENTIFIER(__dict__).
* Remove assert(Py_IS_TYPE(obj, &Locktype)) from release_sentinel()
to avoid having to retrieve the type from this callback.
* Add thread_bootstate_free()
* Rename t_bootstrap() to thread_run()
* bootstate structure: rename keyw member to kwargs
atexit._run_exitfuncs() now logs callback exceptions using
sys.unraisablehook, rather than logging them directly into
sys.stderr and raising the last exception.
Run GeneralTest of test_atexit in a subprocess since it calls
atexit._clear() which clears all atexit callbacks.
_PyAtExit_Fini() sets state->callbacks to NULL.
If config->run_filename doesn't exist, log the error into sys.stderr
using the "%R" format, to properly escape unencodable characters (usually
with backslashreplace).
* Add _PyAtExit_Call() function and remove pyexitfunc and
pyexitmodule members of PyInterpreterState. The function
logs atexit callback errors using _PyErr_WriteUnraisableMsg().
* Add _PyAtExit_Init() and _PyAtExit_Fini() functions.
* Remove traverse, clear and free functions of the atexit module.
Co-authored-by: Dong-hee Na <donghee.na@python.org>
At Python exit, if a callback registered with atexit.register()
fails, its exception is now logged. Previously, only some exceptions
were logged, and the last exception was always silently ignored.
Add _PyAtExit_Call() function and remove
PyInterpreterState.atexit_func member. call_py_exitfuncs() now calls
directly _PyAtExit_Call().
The atexit module must now always be built as a built-in module.
pymain_run_file() no longer encodes the filename: pass the filename
as an object to the new _PyRun_AnyFileObject() function.
Add new private functions:
* _PyRun_AnyFileObject()
* _PyRun_InteractiveLoopObject()
* _Py_FdIsInteractive()
- Copy existing xxlimited to xxlimited_35 (named for the limited API version it uses)
- Build both modules, both in debug and release
- Test both modules
Several built-in and standard library types now ensure that their internal result tuples are always tracked by the garbage collector:
- collections.OrderedDict.items
- dict.items
- enumerate
- functools.reduce
- itertools.combinations
- itertools.combinations_with_replacement
- itertools.permutations
- itertools.product
- itertools.zip_longest
- zip
Previously, they could have become untracked by a prior garbage collection.
No longer use deprecated aliases to functions:
* Replace PyObject_MALLOC() with PyObject_Malloc()
* Replace PyObject_REALLOC() with PyObject_Realloc()
* Replace PyObject_FREE() with PyObject_Free()
* Replace PyObject_Del() with PyObject_Free()
* Replace PyObject_DEL() with PyObject_Free()
No longer use deprecated aliases to functions:
* Replace PyMem_MALLOC() with PyMem_Malloc()
* Replace PyMem_REALLOC() with PyMem_Realloc()
* Replace PyMem_FREE() with PyMem_Free()
* Replace PyMem_Del() with PyMem_Free()
* Replace PyMem_DEL() with PyMem_Free()
Modify also the PyMem_DEL() macro to use directly PyMem_Free().
* bpo-40791: Make compare_digest more constant-time.
The existing volatile `left`/`right` pointers guarantee that the reads will all occur, but they do not guarantee that the results will be _used_. So a compiler can still short-circuit the loop, saving e.g. the overhead of doing the xors and especially the overhead of the data dependency between `result` and the reads. That would change performance depending on where the first unequal byte occurs. This change removes that optimization.
(This is change #1 from https://bugs.python.org/issue40791 .)
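A simplified sketch of the loop shape (not the exact CPython code): accumulating into a volatile result keeps every XOR live, so the compiler cannot stop at the first differing byte.
```c
#include <stddef.h>

/* Constant-time equality check for two equal-length buffers. */
static int
tscmp(const unsigned char *a, const unsigned char *b, size_t len)
{
    volatile unsigned char result = 0;
    for (size_t i = 0; i < len; i++) {
        result |= a[i] ^ b[i];
    }
    return result == 0;
}
```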
Convert the _imp extension module to the multi-phase initialization
API (PEP 489).
* Add _PyImport_BootstrapImp(), which fixes a bootstrap issue: it imports
the _imp module before importlib is initialized.
* Add create_builtin() sub-function, used by _imp_create_builtin().
* Initialize PyInterpreterState.import_func earlier, in
pycore_init_builtins().
* Remove references to _PyImport_Cleanup(). This function has been
renamed to finalize_modules() and moved to pylifecycle.c.
This change partially reverts
commit ad3252bad9
and commit fe2978b3b9.
Many third party C extension modules rely on the ability to use
Py_TYPE() to set an object type ("Py_TYPE(obj) = type;") or Py_SIZE()
to set an object size ("Py_SIZE(obj) = size;").
* Add signal_add_constants() function and add ADD_INT_MACRO macro.
* The Python SIGINT handler is now installed at the end of
signal_exec().
* Use Py_NewRef().
bpo-41686, bpo-41713: On Windows, the SIGINT event,
_PyOS_SigintEvent(), is now created even if Python is configured to
not install signal handlers (PyConfig.install_signal_handlers=0 or
Py_InitializeEx(0)).
Changes:
* Move global variables initialization from signal_exec() to
_PySignal_Init() to clarify that they are global variables cleared
by _PySignal_Fini().
* _PySignal_Fini() now closes sigint_event.
* IntHandler is no longer a global variable.
Remove the undocumented PyOS_InitInterrupts() C function.
* Rename PyOS_InitInterrupts() to _PySignal_Init(). It now installs
other signal handlers, not only SIGINT.
* Rename PyOS_FiniInterrupts() to _PySignal_Fini()
As AIX 5.3 and below do not support thread_cputime, it was decided in
https://bugs.python.org/issue40680 to require AIX 6.1 and above. This
commit removes workarounds for — and references to — older, unsupported
AIX versions.
The time.time(), time.perf_counter() and time.monotonic() functions can
no longer fail with a Python fatal error; instead, they raise a regular
Python exception on failure.
Remove _PyTime_Init(): don't check system, monotonic and perf counter
clocks at startup anymore.
On error, _PyTime_GetSystemClock(), _PyTime_GetMonotonicClock() and
_PyTime_GetPerfCounter() now silently ignore the error and return 0.
They cannot fail with a Python fatal error anymore.
Add py_mach_timebase_info() and win_perf_counter_frequency()
sub-functions.
Explicitly cast PyExc_Exception to PyTypeObject* to fix the warning:
modules\_ctypes\_ctypes.c(5748): warning C4133: '=':
incompatible types - from 'PyObject *' to '_typeobject *'
It is no longer possible to build the _ctypes extension module
without the wchar_t type: remove the CTYPES_UNICODE macro. In any case,
the wchar_t type is required to build Python.
Fix reference leaks in the error path of the initialization function
of the _ctypes extension module: call Py_DECREF(mod) on error.
Change PyCFuncPtr_Type name from _ctypes.PyCFuncPtr to
_ctypes.CFuncPtr to be consistent with the name exposed in the
_ctypes namespace (_ctypes.CFuncPtr).
Split PyInit__ctypes() function into sub-functions and add macros for
readability.
Co-authored-by: Lawrence D’Anna <lawrence_danna@apple.com>
* Add support for macOS 11 and Apple Silicon (aka arm64)
As a side effect of this work use the system copy of libffi on macOS, and remove the vendored copy
* Support building on recent versions of macOS while deploying to older versions
This allows building installers on macOS 11 while still supporting macOS 10.9.
The PyConfig_Read() function now only parses PyConfig.argv arguments
once: PyConfig.parse_argv is set to 2 after arguments are parsed.
Since Python arguments are stripped from PyConfig.argv, parsing
arguments twice would parse the application options as Python
options.
* Rework the PyConfig documentation.
* Fix _testinternalcapi.set_config() error handling.
* SetConfigTests no longer needs parse_argv=0 when restoring the old
configuration.
* Rename _Py_GetLocaleEncoding() to _Py_GetLocaleEncodingObject()
* Add _Py_GetLocaleEncoding() which returns a wchar_t* string to
share code between _Py_GetLocaleEncodingObject()
and config_get_locale_encoding().
* _Py_GetLocaleEncodingObject() now decodes nl_langinfo(CODESET)
from the current locale encoding with surrogateescape,
rather than using UTF-8.
* bpo-42146: Unify cleanup in subprocess_fork_exec()
Also ignore errors from _enable_gc():
* They are always suppressed by the current code due to a bug.
* _enable_gc() is only used if `preexec_fn != None`, which is unsafe.
* We don't have a good way to handle errors in case we successfully
created a child process.
Co-authored-by: Gregory P. Smith <greg@krypto.org>
* Add a new _locale._get_locale_encoding() function to get the
current locale encoding.
* Modify locale.getpreferredencoding() to use it.
* Remove the _bootlocale module.
_io.TextIOWrapper no longer calls getpreferredencoding(False) of
_bootlocale to get the locale encoding, but calls
_Py_GetLocaleEncoding() instead.
Add config_get_fs_encoding() sub-function. Reorganize also
config_get_locale_encoding() code.
The last GC collection is now done before clearing builtins and sys
dictionaries. Add also assertions to ensure that gc.collect() is no
longer called after _PyGC_Fini().
Pass also the tstate to PyInterpreterState_Clear() to pass the
correct tstate to _PyGC_CollectNoFail() and _PyGC_Fini().
Move private _PyGC_CollectNoFail() to the internal C API.
Remove the private _PyGC_CollectIfEnabled() which was just an alias
to the public PyGC_Collect() function since Python 3.8.
Rename functions:
* collect() => gc_collect_main()
* collect_with_callback() => gc_collect_with_callback()
* collect_generations() => gc_collect_generations()
Use _PyLong_GetZero() and _PyLong_GetOne() in Modules/ directory.
_cursesmodule.c and zoneinfo.c are now built with
Py_BUILD_CORE_MODULE macro defined.
Removed the unicodedata.ucnhash_CAPI attribute which was an internal
PyCapsule object. The related private _PyUnicode_Name_CAPI structure
was moved to the internal C API.
Rename unicodedata.ucnhash_CAPI as unicodedata._ucnhash_CAPI.
Convert the unicodedata extension module to the multiphase
initialization API (PEP 489) and convert the unicodedata.UCD static
type to a heap type.
Co-Authored-By: Mohamed Koubaa <koubaa.m@gmail.com>
* UCD_Check() uses PyModule_Check()
* Simplify the internal _PyUnicode_Name_CAPI structure:
* Remove size and state members
* Remove state and self parameters of getcode() and getname()
functions
* Remove global_module_state
The private _PyUnicode_Name_CAPI structure of the PyCapsule API
unicodedata.ucnhash_CAPI moves to the internal C API. Moreover, the
structure gets a new state member which must be passed to the
getcode() and getname() functions.
* Move Include/ucnhash.h to Include/internal/pycore_ucnhash.h
* unicodedata module is now built with Py_BUILD_CORE_MODULE.
* unicodedata: move hashAPI variable into unicodedata_module_state.
If PyDict_GetItemWithError is only used to check whether the key is in dict,
it is better to use PyDict_Contains instead.
And if it is used in combination with PyDict_SetItem, PyDict_SetDefault can
replace the combination.
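A sketch of the two preferred patterns (the surrounding contract is illustrative):
```c
#include <Python.h>

/* Membership test without the error-suppressing lookup functions. */
static int
insert_if_missing(PyObject *dict, PyObject *key, PyObject *value)
{
    int r = PyDict_Contains(dict, key);
    if (r < 0) {
        return -1;                      /* hashing/comparison raised */
    }
    if (r == 0 && PyDict_SetItem(dict, key, value) < 0) {
        return -1;
    }
    return 0;
}

/* Or, when the looked-up value is needed anyway, PyDict_SetDefault()
   does the lookup and the insertion in one call (borrowed reference,
   NULL on error). */
static PyObject *
get_or_insert(PyObject *dict, PyObject *key, PyObject *default_value)
{
    return PyDict_SetDefault(dict, key, default_value);
}
```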
These functions are considered not safe because they suppress all internal
errors and can return a wrong result. PyDict_GetItemString and
_PyDict_GetItemId can also silence the current exception in rare cases.
Remove no longer used _PyDict_GetItemId.
Add _PyDict_ContainsId and rename _PyDict_Contains into
_PyDict_Contains_KnownHash.
Fix memory leak in subprocess.Popen() in case of uid/gid overflow
Also add a test that would catch this leak with `--huntrleaks`.
Alas, the test for `extra_groups` also exposes an inconsistency
in our error reporting: we use a custom ValueError for `extra_groups`,
but propagate OverflowError for `user` and `group`.
It should just be a syscall updating a couple of fields in the kernel-side
process info. Confirming: in glibc it appears to be a shim for the setsid
syscall (based on not finding any code implementing anything special for it),
and in uclibc (*much* easier to read) it is clearly just a setsid syscall shim.
A breadcrumb _suggesting_ that it is not allowed on Darwin/macOS comes from
a commit in emacs: https://lists.gnu.org/archive/html/bug-gnu-emacs/2017-04/msg00297.html
but I don't have a way to verify if that is true or not.
As we are not supporting vfork on macOS today I just left a note in a comment.
Using POSIX_CALL() is incorrect since pthread_sigmask() returns
the error number instead of setting errno.
Also handle failure of the first call to pthread_sigmask()
in the parent process, and explain why we don't handle failure
of the second call in a comment.
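A minimal sketch of the difference (the wrapper is illustrative): pthread_sigmask() reports failure through its return value, so code that inspects errno, as POSIX_CALL() does, sees stale data.
```c
#include <errno.h>
#include <pthread.h>
#include <signal.h>

static int
block_all_signals(sigset_t *old_mask)
{
    sigset_t all;
    sigfillset(&all);
    int err = pthread_sigmask(SIG_SETMASK, &all, old_mask);
    if (err != 0) {
        errno = err;    /* unlike most POSIX calls, errno was not set */
        return -1;
    }
    return 0;
}
```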
* bpo-35823: subprocess: Use vfork() instead of fork() on Linux when safe
When used to run a new executable image, fork() is not a good choice
for process creation, especially if the parent has a large working set:
fork() needs to copy page tables, which is slow, and may fail on systems
where overcommit is disabled, despite that the child is not going to
touch most of its address space.
Currently, subprocess is capable of using posix_spawn() instead, which
normally provides much better performance. However, posix_spawn() does not
support many of child setup operations exposed by subprocess.Popen().
Most notably, it's not possible to express `close_fds=True`, which
happens to be the default, via posix_spawn(). As a result, most users
can't benefit from faster process creation, at least not without
changing their code.
However, Linux provides vfork() system call, which creates a new process
without copying the address space of the parent, and which is actually
used by C libraries to efficiently implement posix_spawn(). Due to sharing
of the address space and even the stack with the parent, extreme care
is required to use vfork(). At least the following restrictions must hold:
* No signal handlers must execute in the child process. Otherwise, they
might clobber memory shared with the parent, potentially confusing it.
* Any library function called after vfork() in the child must be
async-signal-safe (as for fork()), but it must also not interact with any
library state in a way that might break due to address space sharing
and/or lack of any preparations performed by libraries on normal fork().
POSIX.1 permits calling only execve() and _exit(), and later revisions
remove vfork() specification entirely. In practice, however, almost all
operations needed by subprocess.Popen() can be safely implemented on
Linux.
* Due to sharing of the stack with the parent, the child must be careful
not to clobber local variables that are alive across vfork() call.
Compilers are normally aware of this and take extra care with vfork()
(and setjmp(), which has a similar problem).
* In case the parent is privileged, special attention must be paid to vfork()
use, because sharing an address space across different privilege domains
is insecure[1].
This patch adds support for using vfork() instead of fork() on Linux
when it's possible to do safely given the above. In particular:
* vfork() is not used if credential switch is requested. The reverse case
(simple subprocess.Popen() but another application thread switches
credentials concurrently) is not possible for pure-Python apps because
subprocess.Popen() and functions like os.setuid() are mutually excluded
via the GIL. We might also consider adding a way to opt out of vfork() (and
posix_spawn() on platforms where it might be implemented via vfork()) in
a future PR.
* vfork() is not used if `preexec_fn != None`.
With this change, subprocess will still use posix_spawn() if possible, but
will fall back to vfork() on Linux in most cases, and, failing that,
to fork().
[1] https://ewontfix.com/7
Co-authored-by: Gregory P. Smith [Google LLC] <gps@google.com>
* Add F_SETPIPE_SZ and F_GETPIPE_SZ to fcntl module
* Add pipesize parameter for subprocess.Popen class
This will allow the user to control the size of the pipes.
On Linux the default is 64K. When a pipe is full, it blocks for writing.
When a pipe is empty, it blocks for reading. On processes that are
very fast this can lead to a lot of wasted CPU cycles. On a typical
Linux system the maximum pipe size is 1024K, which is much better.
For high performance-oriented libraries such as xopen it is nice to
be able to set the pipe size.
The workaround without this feature is to use my_popen_process.stdout.fileno() in
conjunction with fcntl and 1031 (the value of F_SETPIPE_SZ) to achieve this behavior.
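For reference, a sketch in C of what the new parameter boils down to on Linux (the wrapper is illustrative):
```c
#define _GNU_SOURCE
#include <fcntl.h>

/* Grow a pipe's kernel buffer; returns the size actually allocated,
   or -1 with errno set. F_SETPIPE_SZ is Linux-specific. */
static int
set_pipe_size(int fd, int nbytes)
{
    return fcntl(fd, F_SETPIPE_SZ, nbytes);
}
```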
Prepare unicodedata to add a state per module: start with a global
"module" state, pass it to subfunctions which access &UCD_Type. This
change also prepares the conversion of the UCD_Type static type to a
heap type.
This API is relatively lightweight and organizationally, given that it's
used by multiple modules, it makes sense to move it to fileutils.
Requires making sure that _posixsubprocess is compiled with the appropriate
Py_BUILD_CORE_BUILTIN macro.
This suppression is no longer needed in os_closerange_impl, as it just
invokes the internal _Py_closerange implementation. On the other hand,
consumers of _Py_closerange may not have any other reason to suppress
invalid parameter issues, so narrow the scope to here.
close_range(2) should be preferred at all times if it's available; otherwise we'll use closefrom(2) if available, with a fallback to fdwalk(3) or a plain old loop over the fd range, in order of most to least efficient.
[note that this version does check for ENOSYS, but currently ignores all other errors]
Automerge-Triggered-By: @pablogsal
Such an API can be used both for os.closerange and subprocess. For the latter, this yields potential improvement for platforms that have fdwalk but wouldn't have used it there. This will prove even more beneficial later for platforms that have close_range(2), as the new API will prefer that over all else if it's available.
The new API is structured to look more like close_range(2), closing from [start, end] rather than the [low, high) of os.closerange().
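A usage sketch of the helper's contract (the bounds are illustrative; note the inclusive upper bound, unlike os.closerange()):
```c
#include "Python.h"

/* Private helper added by this change; the real declaration lives in
   CPython's fileutils header and is mirrored here for illustration. */
extern void _Py_closerange(int first, int last);

/* Close every descriptor from 3 up to and including max_fd, using the
   most efficient mechanism available: close_range(2), closefrom(2),
   fdwalk(3), or a plain loop. */
static void
close_inherited_fds(int max_fd)
{
    _Py_closerange(3, max_fd);
}
```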
Automerge-Triggered-By: @gpshead