* Split _Py_InitializeCore_impl() into subfunctions: add multiple pycore_init_xxx() functions
* Preliminary sys.stderr is now set earlier to get an usable
sys.stderr ealier.
* Move code into _Py_Initialize_ReconfigureCore() to be able to call
it from _Py_InitializeCore().
* Split _PyExc_Init(): create a new _PyBuiltins_AddExceptions()
function.
* Call _PyExc_Init() earlier in _Py_InitializeCore_impl()
and new_interpreter() to get working exceptions earlier.
* _Py_ReadyTypes() now returns _PyInitError rather than calling
Py_FatalError().
* Misc code cleanup
* The PyByteArray_Init() and PyByteArray_Fini() functions have been
removed. They did nothing since Python 2.7.4 and Python 3.2.0, were
excluded from the limited API (stable ABI), and were not
documented.
* Move "_PyXXX_Init()" and "_PyXXX_Fini()" declarations from
Include/cpython/pylifecycle.h to
Include/internal/pycore_pylifecycle.h. Replace
"PyAPI_FUNC(TYPE)" with "extern TYPE".
* _PyExc_Init() now returns an error on failure rather than calling
Py_FatalError(). Move macros inside _PyExc_Init() and undefine them
when done. Rewrite macros to make them look more like statement:
add ";" when using them, add "do { ... } while (0)".
* _PyUnicode_Init() now returns a _PyInitError error rather than call
Py_FatalError().
* Move stdin check from _PySys_BeginInit() to init_sys_streams().
* _Py_ReadyTypes() now returns a _PyInitError error rather than
calling Py_FatalError().
The majority of this PR is tediously passing `end_lineno` and `end_col_offset` everywhere. Here are non-trivial points:
* It is not possible to reconstruct end positions in AST "on the fly", some information is lost after an AST node is constructed, so we need two more attributes for every AST node `end_lineno` and `end_col_offset`.
* I add end position information to both CST and AST. Although it may be technically possible to avoid adding end positions to CST, the code becomes more cumbersome and less efficient.
* Since the end position is not known for non-leaf CST nodes while the next token is added, this requires a bit of extra care (see `_PyNode_FinalizeEndPos`). Unless I made some mistake, the algorithm should be linear.
* For statements, I "trim" the end position of suites to not include the terminal newlines and dedent (this seems to be what people would expect), for example in
```python
class C:
pass
pass
```
the end line and end column for the class definition is (2, 8).
* For `end_col_offset` I use the common Python convention for indexing, for example for `pass` the `end_col_offset` is 4 (not 3), so that `[0:4]` gives one the source code that corresponds to the node.
* I added a helper function `ast.get_source_segment()`, to get source text segment corresponding to a given AST node. It is also useful for testing.
An (inevitable) downside of this PR is that AST now takes almost 25% more memory. I think however it is probably justified by the benefits.
* Disable x87 control word for non-x86 targets
On msvc, x87 control word is only available on x86 target. Need to disable it for other targets to prevent compiling problems.
* Include immintrin.h on x86 and x64 only
Immintrin.h is only available on x86 and x64. Need to disable it for other targets to prevent compiling problems.
This change separates the signal handling trigger in the eval loop from the "pending calls" machinery. There is no semantic change and the difference in performance is insignificant.
The change makes both components less confusing. It also eliminates the risk of changes to the pending calls affecting signal handling. This is particularly relevant for some upcoming pending calls changes I have in the works.
Use crypt_r() when available instead of crypt() in the crypt module.
As a nice side effect: This also avoids a memory sanitizer flake as clang msan doesn't know about crypt's internal libc allocated buffer.
"Include/token.h", "Lib/token.py" (containing now some data moved from
"Lib/tokenize.py") and new files "Parser/token.c" (containing the code
moved from "Parser/tokenizer.c") and "Doc/library/token-list.inc" (included
in "Doc/library/token.rst") are now generated from "Grammar/Tokens" by
"Tools/scripts/generate_token.py". The script overwrites files only if
needed and can be used on the read-only sources tree.
"Lib/symbol.py" is now generated by "Tools/scripts/generate_symbol_py.py"
instead of been executable itself.
Added new make targets "regen-token" and "regen-symbol" which are now
dependencies of "regen-all".
The documentation contains now strings for operators and punctuation tokens.
Move tupleobject.h code surrounded by "#ifndef Py_LIMITED_API"
to a new Include/cpython/tupleobject.h header file.
Add cpython/ header files to Makefile.pre.in and pythoncore project
of PCbuild.
Fix the following clang warning:
Include/cpython/pystate.h:217:3: warning: redefinition of
typedef 'PyThreadState' is a C11 feature [-Wtypedef-redefinition]
* Move dictobject.h code surrounded by "#ifndef Py_LIMITED_API"
to a new Include/cpython/dictobject.h header file.
* Add PyAPI_FUNC() to _PyDictView_New().
* Reorganize dictobject.h: move views and iterators at the end.
* Move object.h code surrounded by "#ifndef Py_LIMITED_API"
to a new Include/cpython/object.h header file.
* "typedef struct _typeobject PyTypeObject;" is now always defined
in object.h, but if Py_LIMITED_API is not defined,
Include/cpython/object.h also defines the structure.
* Complete the printfunc comment to mention Py_LIMITED_API define.
Fix str.format(), float.__format__() and complex.__format__() methods
for non-ASCII decimal point when using the "n" formatter.
Changes:
* Rewrite _PyUnicode_InsertThousandsGrouping(): it now requires
a _PyUnicodeWriter object for the buffer and a Python str object
for digits.
* Rename FILL() macro to unicode_fill(), convert it to static inline function,
add "assert(0 <= start);" and rework its code.
Add "assert(PyTuple_Check(op));" to _PyTuple_CAST() to check that the
argument is a tuple object in debug mode.
PyTuple_GET_SIZE() now uses _PyTuple_CAST() to get its assertion.
Include/*.h should be the "portable Python API", whereas
Include/cpython/*.h should be the "CPython API": CPython
implementation details.
Changes:
* Create Include/cpython/ subdirectory
* "make install" now creates $prefix/include/cpython and copy
Include/cpython/* to $prefix/include/cpython
* Create Include/cpython/objimpl.h: move objimpl.h code
surrounded by "#ifndef Py_LIMITED_API" to cpython/objimpl.h.
* objimpl.h now includes cpython/objimpl.h
* Windows installer (MSI) now also install Include/ subdirectories:
Include/cpython/ and Include/internal/.
bpo-34523, bpo-35290: C locale coercion now resets the Python
internal "force ASCII" mode. This change fix the filesystem encoding
on FreeBSD CURRENT, which has a new "C.UTF-8" locale, when
the UTF-8 mode is disabled.
Add _Py_ResetForceASCII(): _Py_SetLocaleFromEnv() now calls it.
When iterating using asdl_seq_LEN(), use 'Py_ssize_t' type instead of
'int' for the iterator variable, to avoid downcast on 64-bit platforms.
_Py_asdl_int_seq_new() now also ensures that the index is greater than
or equal to 0.
* Add _PyObject_ASSERT_FROM() and _PyObject_ASSERT_FAILED_MSG()
macros.
* PyObject_GC_Track() now calls _PyObject_ASSERT_FAILED_MSG(),
instead of Py_FatalError(), if the object is already tracked, to
dump more information on error.
* _PyObject_GC_TRACK() no longer checks if the object is already
tracked at runtime, use an assertion instead for best performances;
PyObject_GC_Track() still checks at runtime.
* pycore_object.h now includes pycore_pystate.h.
* Convert _PyObject_GC_TRACK() and _PyObject_GC_UNTRACK() macros to
inline functions.
Partially revert commit 1a6be91e6f,
move back PyGC API from the internal API to the C API:
* _PyGCHead_NEXT(g), _PyGCHead_SET_NEXT(g, p)
* _PyGCHead_PREV(g), _PyGCHead_SET_PREV(g, p)
* _PyGCHead_FINALIZED(g), _PyGCHead_SET_FINALIZED(g)
* _PyGC_FINALIZED(o), _PyGC_SET_FINALIZED(o)
* _PyGC_PREV_MASK_FINALIZED
* _PyGC_PREV_MASK_COLLECTING
* _PyGC_PREV_SHIFT
* _PyGC_PREV_MASK
_PyObject_GC_TRACK(o) and _PyObject_GC_UNTRACK(o) remain in the
internal API.
locale.localeconv() now sets temporarily the LC_CTYPE locale to the
LC_MONETARY locale if the two locales are different and monetary
strings are non-ASCII. This temporary change affects other threads.
Changes:
* locale.localeconv() can now set LC_CTYPE to LC_MONETARY to decode
monetary fields.
* Add LocaleInfo.grouping_buffer: copy localeconv() grouping string
since it can be replaced anytime if a different thread calls
localeconv().
* _Py_GetLocaleconvNumeric() now requires a "struct lconv *"
structure, so locale.localeconv() now longer calls localeconv()
twice. Moreover, the function now requires all arguments to be
non-NULL.
* Rename STATIC_LOCALE_INFO_INIT to LocaleInfo_STATIC_INIT.
* Move _Py_GetLocaleconvNumeric() definition from fileutils.h
to pycore_fileutils.h. pycore_fileutils.h now includes locale.h.
* The _locale module is now built with Py_BUILD_CORE defined.
test_embed.InitConfigTests tests more configuration variables.
Changes:
* InitConfigTests tests more core configuration variables:
* base_exec_prefix
* base_prefix
* exec_prefix
* home
* legacy_windows_fs_encoding
* legacy_windows_stdio
* module_search_path_env
* prefix
* "_testembed init_from_config" tests more variables:
* argv
* warnoptions
* xoptions
* InitConfigTests: add check_global_config(), check_core_config() and
check_main_config() subfunctions to cleanup the code. Move also
constants at the class level (ex: COPY_MAIN_CONFIG).
* Fix _PyCoreConfig_AsDict(): don't set stdio_encoding twice
* Use more macros in _PyCoreConfig_AsDict() and
_PyMainInterpreterConfig_AsDict() to reduce code duplication.
* Other minor cleanups.
* Fix _PyCoreConfig_SetGlobalConfig(): set also Py_FrozenFlag
* Fix _PyCoreConfig_AsDict(): export also xoptions
* Add _Py_GetGlobalVariablesAsDict() and _testcapi.get_global_config()
* test.pythoninfo: dump also global configuration variables
* _testembed now serializes global, core and main configurations
using JSON to reuse _Py_GetGlobalVariablesAsDict(),
_PyCoreConfig_AsDict() and _PyMainInterpreterConfig_AsDict(),
rather than duplicating code.
* test_embed.InitConfigTests now test much more configuration
variables
* Fix _PyMainInterpreterConfig_Copy():
copy 'install_signal_handlers' attribute
* Add _PyMainInterpreterConfig_AsDict()
* Add unit tests on the main interpreter configuration
to test_embed.InitConfigTests
* test.pythoninfo: log also main_config
Datetime macros like PyDate_Check() have two implementations, one using
the C API capsule and one using direct access to the datetime type
symbols defined in _datetimemodule.c. Since the direct access versions
of the macros are only used in _datetimemodule.c, they have been moved
out of "datetime.h" and into _datetimemodule.c.
The _PY_DATETIME_IMPL macro is currently necessary in order to avoid
both duplicate definitions of these macros in _datetimemodule.c and
unnecessary declarations of C API capsule-related macros and varibles in
datetime.h.
Co-Authored-By: Victor Stinner <vstinner@redhat.com>
Adds configure flags for msan and ubsan builds to make it easier to enable.
These also encode the detail that address sanitizer and memory sanitizer
should disable pymalloc.
Define MEMORY_SANITIZER when appropriate at build time and adds workarounds
to existing code to mark things as initialized where the sanitizer is otherwise unable to
determine that. This lets our build succeed under the memory sanitizer. not all tests
pass without sanitizer failures yet but we're in pretty good shape after this.
* ast.h now includes Python-ast.h and node.h
* parsetok.h now includes node.h and grammar.h
* symtable.h now includes Python-ast.h
* Modify asdl_c.py to enhance Python-ast.h:
* Add #ifndef/#define Py_PYTHON_AST_H to be able to include the header
twice
* Add "extern { ... }" for C++
* Undefine "Yield" macro conflicting with winbase.h
* Remove "#undef Yield" from C files, it's now done in Python-ast.h
* Remove now useless includes in C files
* _PyTuple_ITEMS() gives access to the tuple->ob_item field and cast the
first argument to PyTupleObject*. This internal macro is only usable if
Py_BUILD_CORE is defined.
* Replace &PyTuple_GET_ITEM(ob, 0) with _PyTuple_ITEMS(ob).
* Replace PyTuple_GET_ITEM(op, 1) with &_PyTuple_ITEMS(ob)[1].
* All internal header files now require Py_BUILD_CORE or
Py_BUILD_CORE_BUILTIN to be defined.
* _json.c is now compiled with Py_BUILD_CORE_BUILTIN to access
pycore_accu.h header.
* Add an example to Modules/Setup to show how to build _json
as a built-in module; it requires non trivial compiler options.
_testcapimodule.c must not include pycore_pathconfig.h, since it's an
internal header files.
Changes:
* Add _PyCoreConfig_AsDict() function to coreconfig.c.
* Remove pycore_pathconfig.h include from _testcapimodule.h.
* pycore_pathconfig.h now requires Py_BUILD_CORE to be defined.
* _testcapimodule.c compilation now fails if it's built with
Py_BUILD_CORE defined.
* And pycore_lifecycle.h and pycore_pathconfig.h headers to
Include/internal/
* Move Py_BUILD_CORE specific code from coreconfig.h and
pylifecycle.h to pycore_pathconfig.h and pycore_lifecycle.h
* Move _Py_wstrlist_XXX() definitions and _PyPathConfig code
from pycore_state.h to pycore_pathconfig.h
* Move "Init" and "Fini" function definitions from pylifecycle.c to
pycore_lifecycle.h.
The accu.h header is no longer part of the Python C API: it has been
moved to the "internal" headers which are restricted to Python
itself.
Replace #include "accu.h" with #include "pycore_accu.h".
If Py_BUILD_CORE is defined, the PyThreadState_GET() macro access
_PyRuntime which comes from the internal pycore_state.h header.
Public headers must not require internal headers.
Move PyThreadState_GET() and _PyInterpreterState_GET_UNSAFE() from
Include/pystate.h to Include/internal/pycore_state.h, and rename
PyThreadState_GET() to _PyThreadState_GET() there.
The PyThreadState_GET() macro of pystate.h is now redefined when
pycore_state.h is included, to use the fast _PyThreadState_GET().
Changes:
* Add _PyThreadState_GET() macro
* Replace "PyThreadState_GET()->interp" with
_PyInterpreterState_GET_UNSAFE()
* Replace PyThreadState_GET() with _PyThreadState_GET() in internal C
files (compiled with Py_BUILD_CORE defined), but keep
PyThreadState_GET() in the public header files.
* _testcapimodule.c: replace PyThreadState_GET() with
PyThreadState_Get(); the module is not compiled with Py_BUILD_CORE
defined.
* pycore_state.h now requires Py_BUILD_CORE to be defined.
* Remove _PyThreadState_Current
* Replace GET_TSTATE() with PyThreadState_GET()
* Replace GET_INTERP_STATE() with _PyInterpreterState_GET_UNSAFE()
* Replace direct access to _PyThreadState_Current with
PyThreadState_GET()
* Replace _PyThreadState_Current with
_PyRuntime.gilstate.tstate_current
* Rename SET_TSTATE() to _PyThreadState_SET(), name more
consistent with _PyThreadState_GET()
* Update outdated comments
Make _PySys_AddXOptionWithError() and _PySys_AddWarnOptionWithError()
functions private again. They are no longer needed to initialize Python:
_PySys_EndInit() is now responsible to add these options instead.
Moreover, PySys_AddWarnOptionUnicode() now clears the exception on
failure if possible.
* bpo-34523, bpo-34403: Fix config_init_fs_encoding(): it now uses
ASCII if _Py_GetForceASCII() is true.
* Fix a regression of commit b2457efc78.
* Fix also a memory leak: get_locale_encoding() already allocates
memory, no need to duplicate the string.
Configuring python with ./configure --with-pydebug CFLAGS="-D COUNT_ALLOCS -O0"
makes "make smelly" fail as some symbols were being exported without the "Py_" or
"_Py" prefixes.
* Convert PyObject_INIT() and PyObject_INIT_VAR() macros to static
inline functions.
* Fix usage of these functions: cast to PyObject* or PyVarObject*.
_PyTraceMalloc_NewReference() is now called by _Py_NewReference(), so
move its definition to object.h. Moreover, define it even if
Py_LIMITED_API is defined, since _Py_NewReference() is also exposed
even if Py_LIMITED_API is defined.
Changes:
* Add _PyObject_AssertFailed() function.
* Add _PyObject_ASSERT() and _PyObject_ASSERT_WITH_MSG() macros.
* gc_decref(): replace assert() with _PyObject_ASSERT_WITH_MSG() to
dump the faulty object if the assertion fails.
_PyObject_AssertFailed() calls:
* _PyMem_DumpTraceback(): try to log the traceback where the object
memory has been allocated if tracemalloc is enabled.
* _PyObject_Dump(): log repr(obj).
* Py_FatalError(): log the current Python traceback.
_PyObject_AssertFailed() uses _PyObject_IsFreed() heuristic to check
if the object memory has been freed by a debug hook on Python memory
allocators.
Initial patch written by David Malcolm.
Co-Authored-By: David Malcolm <dmalcolm@redhat.com>
* Add Py_STATIC_INLINE() macro to declare a "static inline" function.
If the compiler supports it, try to always inline the function even if no
optimization level was specified.
* Modify pydtrace.h to use Py_STATIC_INLINE() when WITH_DTRACE is
not defined.
* Add an unit test on Py_DECREF() to make sure that
_Py_NegativeRefcount() reports the correct filename.
* Modify object.h to ensure that pymem.h is included,
to get _Py_tracemalloc_config variable.
* Move _PyTraceMalloc_XXX() functions to tracemalloc.h,
they need PyObject type. Break circular dependency between pymem.h
and object.h.
tracemalloc now tries to update the traceback when an object is
reused from a "free list" (optimization for faster object creation,
used by the builtin list type for example).
Changes:
* Add _PyTraceMalloc_NewReference() function which tries to update
the Python traceback of a Python object.
* _Py_NewReference() now calls _PyTraceMalloc_NewReference().
* Add an unit test.
_PyObject_Dump() now uses an heuristic to check if the object memory
has been freed: log "<freed object>" in that case.
The heuristic rely on the debug hooks on Python memory allocators
which fills the memory with DEADBYTE (0xDB) when memory is
deallocated. Use PYTHONMALLOC=debug to always enable these debug
hooks.
* Revert "bpo-34589: Add -X coerce_c_locale command line option (GH-9378)"
This reverts commit dbdee0073c.
* Revert "bpo-34589: C locale coercion off by default (GH-9073)"
This reverts commit 7a0791b699.
* Revert "bpo-34589: Make _PyCoreConfig.coerce_c_locale private (GH-9371)"
This reverts commit 188ebfa475.
The C accelerated _elementtree module now initializes hash randomization
salt from _Py_HashSecret instead of libexpat's default CPRNG.
Signed-off-by: Christian Heimes <christian@python.org>
https://bugs.python.org/issue34623
Py_Initialize() and Py_Main() cannot enable the C locale coercion
(PEP 538) anymore: it is always disabled. It can now only be enabled
by the Python program ("python3).
test_embed: get_filesystem_encoding() doesn't have to set PYTHONUTF8
nor PYTHONCOERCECLOCALE, these variables are already set in the
parent.
_PyCoreConfig:
* Rename coerce_c_locale to _coerce_c_locale
* Rename coerce_c_locale_warn to _coerce_c_locale_warn
These fields are now private (name prefixed by "_").
When os.fork() is called (on platforms that support it) all threads but the current one are destroyed in the child process. Consequently we must ensure that all but the associated interpreter are likewise destroyed. The main interpreter is critical for runtime operation, so we must ensure that fork only happens in the main interpreter.
https://bugs.python.org/issue34651
* Add _testcapi.get_coreconfig() to get the _PyCoreConfig of the
interpreter
* test.pythoninfo now gets the core configuration using
_testcapi.get_coreconfig()
Use the core configuration of the interpreter, rather
than using global configuration variables. For example, replace
Py_QuietFlag with core_config->quiet.
* Py_FileSystemDefaultEncoding and Py_FileSystemDefaultEncodeErrors
default value is now NULL: initfsencoding() set them
during Python initialization.
* Document how Python chooses the filesystem encoding and error
handler.
* Add an assertion to _PyCoreConfig_Read().
Add support for the "surrogatepass" error handler in
PyUnicode_DecodeFSDefault() and PyUnicode_EncodeFSDefault()
for the UTF-8 encoding.
Changes:
* _Py_DecodeUTF8Ex() and _Py_EncodeUTF8Ex() now support the
surrogatepass error handler (_Py_ERROR_SURROGATEPASS).
* _Py_DecodeLocaleEx() and _Py_EncodeLocaleEx() now use
the _Py_error_handler enum instead of "int surrogateescape" to pass
the error handler. These functions now return -3 if the error
handler is unknown.
* Add unit tests on _Py_DecodeLocaleEx() and _Py_EncodeLocaleEx()
in test_codecs.
* Rename get_error_handler() to _Py_GetErrorHandler() and expose it
as a private function.
* _freeze_importlib doesn't need config.filesystem_errors="strict"
workaround anymore.
_PyCoreConfig_Read() is now responsible to choose the filesystem
encoding and error handler. Using Py_Main(), the encoding is now
chosen even before calling Py_Initialize().
_PyCoreConfig.filesystem_encoding is now the reference, instead of
Py_FileSystemDefaultEncoding, for the Python filesystem encoding.
Changes:
* Add filesystem_encoding and filesystem_errors to _PyCoreConfig
* _PyCoreConfig_Read() now reads the locale encoding for the file
system encoding.
* PyUnicode_EncodeFSDefault() and PyUnicode_DecodeFSDefaultAndSize()
now use the interpreter configuration rather than
Py_FileSystemDefaultEncoding and Py_FileSystemDefaultEncodeErrors
global configuration variables.
* Add _Py_SetFileSystemEncoding() and _Py_ClearFileSystemEncoding()
private functions to only modify Py_FileSystemDefaultEncoding and
Py_FileSystemDefaultEncodeErrors in coreconfig.c.
* _Py_CoerceLegacyLocale() now takes an int rather than
_PyCoreConfig for the warning.
On HP-UX with C or POSIX locale, sys.getfilesystemencoding() now returns
"ascii" instead of "roman8" (when the UTF-8 Mode is disabled and the C locale
is not coerced).
nl_langinfo(CODESET) announces "roman8" whereas it uses the Latin1
encoding in practice.
bpo-31650, bpo-34170: Replace _Py_CheckHashBasedPycsMode with
_PyCoreConfig._check_hash_pycs_mode. Modify PyInit__imp() and
zipimport to get the parameter from the current interpreter core
configuration.
Remove Include/internal/import.h file.
* Add Include/coreconfig.h
* Move config_*() and _PyCoreConfig_*() functions from Modules/main.c
to a new Python/coreconfig.c file.
* Inline _Py_ReadHashSeed() into config_init_hash_seed()
* Move global configuration variables to coreconfig.c