When the loop in the pymain_read_conf function in this same file
calls pymain_init_cmdline_argv() a 2nd time, the pymain->command
buffer of wchar_t is overriden and the previously allocated memory
is never freed.
The majority of this PR is tediously passing `end_lineno` and `end_col_offset` everywhere. Here are non-trivial points:
* It is not possible to reconstruct end positions in AST "on the fly", some information is lost after an AST node is constructed, so we need two more attributes for every AST node `end_lineno` and `end_col_offset`.
* I add end position information to both CST and AST. Although it may be technically possible to avoid adding end positions to CST, the code becomes more cumbersome and less efficient.
* Since the end position is not known for non-leaf CST nodes while the next token is added, this requires a bit of extra care (see `_PyNode_FinalizeEndPos`). Unless I made some mistake, the algorithm should be linear.
* For statements, I "trim" the end position of suites to not include the terminal newlines and dedent (this seems to be what people would expect), for example in
```python
class C:
pass
pass
```
the end line and end column for the class definition is (2, 8).
* For `end_col_offset` I use the common Python convention for indexing, for example for `pass` the `end_col_offset` is 4 (not 3), so that `[0:4]` gives one the source code that corresponds to the node.
* I added a helper function `ast.get_source_segment()`, to get source text segment corresponding to a given AST node. It is also useful for testing.
An (inevitable) downside of this PR is that AST now takes almost 25% more memory. I think however it is probably justified by the benefits.
Fix a NULL pointer deref in ssl module. The cert parser did not handle CRL
distribution points with empty DP or URI correctly. A malicious or buggy
certificate can result into segfault.
Signed-off-by: Christian Heimes <christian@python.org>
https://bugs.python.org/issue35746
Previously, calling the strftime() method on a datetime object with a
trailing '%' in the format string would result in an exception. However,
this only occured when the datetime C module was being used; the python
implementation did not match this behavior. Datetime is now PEP-399
compliant, and will not throw an exception on a trailing '%'.
Use the fast call convention for math functions atan2(),
copysign(), hypot() and remainder() and inline unpacking
arguments. This sped up them by 1.3--2.5 times.
This change separates the signal handling trigger in the eval loop from the "pending calls" machinery. There is no semantic change and the difference in performance is insignificant.
The change makes both components less confusing. It also eliminates the risk of changes to the pending calls affecting signal handling. This is particularly relevant for some upcoming pending calls changes I have in the works.
Use _PyArg_CheckPositional() and inlined code instead of
PyArg_UnpackTuple() and _PyArg_UnpackStack() if all parameters
are positional and use the "object" converter.
Fix memory leaks in asyncio ProactorEventLoop on overlapped operation
failures.
Changes:
* Implement the tp_traverse slot in the _overlapped.Overlapped type
to help to break reference cycles and identify referrers in the
garbage collector.
* Always clear overlapped on failure: not only set type to
TYPE_NOT_STARTED, but release also resources.
As in title, expose C `raise` function as `raise_function` in `signal` module. Also drop existing `raise_signal` in `_testcapi` module and replace all usages with new function.
https://bugs.python.org/issue35568
Add Clang Memory Sanitizer build instrumentation to work around
false positives from the socket and time modules as well as skipping
a couple test_faulthandler tests.
Use crypt_r() when available instead of crypt() in the crypt module.
As a nice side effect: This also avoids a memory sanitizer flake as clang msan doesn't know about crypt's internal libc allocated buffer.
* bpo-32492: 2.5x speed up in namedtuple attribute access using C fast path
* Add News entry
* fixup! bpo-32492: 2.5x speed up in namedtuple attribute access using C fast path
* Check for tuple in the __get__ of the new descriptor and don't cache the descriptor itself
* Don't inherit from property. Implement GC methods to handle __doc__
* Add a test for the docstring substitution in descriptors
* Update NEWS entry to reflect time against 3.7 branch
* Simplify implementation with argument clinic, better error messages, only __new__
* Use positional-only parameters for the __new__
* Use PyTuple_GET_SIZE and PyTuple_GET_ITEM to tighter the implementation of tuplegetterdescr_get
* Implement __set__ to make tuplegetter a data descriptor
* Use Py_INCREF now that we inline PyTuple_GetItem
* Apply the valid_index() function, saving one test
* Move Py_None test out of the critical path.
* Fix test_mktime on AIX by adding code to get mktime to behave the
same way as it does on other *nix systems
* Fix test_pthread_getcpuclickid in AIX by adjusting the test case
expectations when running on AIX in 32-bit mode
Patch by Michael Felt.
This test case needs "signed short" bitfields, but the
IBM XLC compiler (on AIX) does not support this.
Skip the code and test when AIX and XLC are used.
Use __xlc__ as identifier to detect the XLC compiler.
Include <pyconfig.h> ealier in Modules/expat/xmltok.c to define
properly _POSIX_C_SOURCE. Python defines _POSIX_C_SOURCE as 200809L,
whereas <features.h> (included indirectly by <string.h>) defines
_POSIX_C_SOURCE as 199506L.
The length check for AF_ALG salg_name and salg_type had a off-by-one
error. The code assumed that both values are not necessarily NULL
terminated. However the Kernel code for alg_bind() ensures that the last
byte of both strings are NULL terminated.
Signed-off-by: Christian Heimes <christian@python.org>
In _localemodule.c and selectmodule.c, remove dead code that would
cause double decrefs if run.
In addition, replace PyList_SetItem() with PyList_SET_ITEM() in cases
where a new list is populated and there is no possibility of an error.
In addition, check if the list changed size in the loop in array_array_fromlist().
select() calls are retried on EINTR (per PEP 475). However, if a
timeout was provided and the deadline has passed after running the
signal handlers, rlist, wlist and xlist should be cleared since select(2)
left them unmodified.
* bpo-31572: Get rid of PyObject_HasAttrString() in ctypes.
* Fix error handling for _pack_.
* Don't silence errors when look up in a dict.
* Use _PyObject_LookupAttrId().
* More changes.
* PyInit_time() now returns NULL if an exception is raised.
* Rename PyInit_timezone() to init_timezone(). "PyInit_" prefix is
a special prefix for function initializing a module.
init_timezone() doesn't initialize a module and the function is not
exported.
get_gmtoff() now returns time_t instead of int to fix the following
Visual Studio warning:
Modules\timemodule.c(1183): warning C4244: 'return':
conversion from 'time_t' to 'int', possible loss of data
Replace strncpy() with memcpy() in call_readline() to fix the
following warning, the NUL byte is written manually just after:
Modules/readline.c: In function ‘call_readline’:
Modules/readline.c:1303:9: warning: ‘strncpy’ output truncated before
terminating nul copying as many bytes from a string as its length
[-Wstringop-truncation]
strncpy(p, q, n);
^~~~~~~~~~~~~~~~
Modules/readline.c:1279:9: note: length computed here
n = strlen(p);
^~~~~~~~~
Fix invalid function cast warnings with gcc 8
for method conventions different from METH_NOARGS, METH_O and
METH_VARARGS excluding Argument Clinic generated code.
Fix invalid function cast warnings with gcc 8
for method conventions different from METH_NOARGS, METH_O and
METH_VARARGS in Argument Clinic generated code.
os_read_impl() now also truncates the size to _PY_READ_MAX
on macOS, to avoid to allocate a larger buffer even if _Py_read() is
limited to _PY_READ_MAX bytes (ex: INT_MAX on macOS).
Explicit cast a pointer difference (intptr_t) to int to fix
two warnings on 64-bit Windows:
Modules\pyexpat.c(1181): warning C4244: 'initializing':
conversion from '__int64' to 'int', possible loss of data
Modules\pyexpat.c(1192): warning C4244: 'initializing':
conversion from '__int64' to 'int', possible loss of data
Fixed the following compiler warning in multibytecodec.c:
warning C4244: '=': conversion from 'Py_ssize_t'
to 'unsigned char', possible loss of data
Cast Py_ssize_t to unsigned char: the maximum value is checked
on the previous line.
Don't pass complex expressions but regular variables to Python
macros.
* _datetimemodule.c: split single large "if" into two "if"
in date_new(), time_new() and datetime_new().
* _pickle.c, load_extension(): flatten complex "if" expression into
more regular C code.
* _ssl.c: addbool() now uses a temporary bool_obj to only evaluate
the value once.
* weakrefobject.c: replace "Py_INCREF(result = proxy);"
with "result = proxy; Py_INCREF(result);"
* Add _PyObject_ASSERT_FROM() and _PyObject_ASSERT_FAILED_MSG()
macros.
* PyObject_GC_Track() now calls _PyObject_ASSERT_FAILED_MSG(),
instead of Py_FatalError(), if the object is already tracked, to
dump more information on error.
* _PyObject_GC_TRACK() no longer checks if the object is already
tracked at runtime, use an assertion instead for best performances;
PyObject_GC_Track() still checks at runtime.
* pycore_object.h now includes pycore_pystate.h.
* Convert _PyObject_GC_TRACK() and _PyObject_GC_UNTRACK() macros to
inline functions.
Fixes assertion failures in _datetimemodule.c
introduced in the previous fix (see bpo-31752).
Rather of trying to handle an int subclass as exact int,
let it to use overridden special methods, but check the
result of divmod().
locale.localeconv() now sets temporarily the LC_CTYPE locale to the
LC_MONETARY locale if the two locales are different and monetary
strings are non-ASCII. This temporary change affects other threads.
Changes:
* locale.localeconv() can now set LC_CTYPE to LC_MONETARY to decode
monetary fields.
* Add LocaleInfo.grouping_buffer: copy localeconv() grouping string
since it can be replaced anytime if a different thread calls
localeconv().
* _Py_GetLocaleconvNumeric() now requires a "struct lconv *"
structure, so locale.localeconv() now longer calls localeconv()
twice. Moreover, the function now requires all arguments to be
non-NULL.
* Rename STATIC_LOCALE_INFO_INIT to LocaleInfo_STATIC_INIT.
* Move _Py_GetLocaleconvNumeric() definition from fileutils.h
to pycore_fileutils.h. pycore_fileutils.h now includes locale.h.
* The _locale module is now built with Py_BUILD_CORE defined.
test_embed.InitConfigTests tests more configuration variables.
Changes:
* InitConfigTests tests more core configuration variables:
* base_exec_prefix
* base_prefix
* exec_prefix
* home
* legacy_windows_fs_encoding
* legacy_windows_stdio
* module_search_path_env
* prefix
* "_testembed init_from_config" tests more variables:
* argv
* warnoptions
* xoptions
* InitConfigTests: add check_global_config(), check_core_config() and
check_main_config() subfunctions to cleanup the code. Move also
constants at the class level (ex: COPY_MAIN_CONFIG).
* Fix _PyCoreConfig_AsDict(): don't set stdio_encoding twice
* Use more macros in _PyCoreConfig_AsDict() and
_PyMainInterpreterConfig_AsDict() to reduce code duplication.
* Other minor cleanups.
* Fix _PyCoreConfig_SetGlobalConfig(): set also Py_FrozenFlag
* Fix _PyCoreConfig_AsDict(): export also xoptions
* Add _Py_GetGlobalVariablesAsDict() and _testcapi.get_global_config()
* test.pythoninfo: dump also global configuration variables
* _testembed now serializes global, core and main configurations
using JSON to reuse _Py_GetGlobalVariablesAsDict(),
_PyCoreConfig_AsDict() and _PyMainInterpreterConfig_AsDict(),
rather than duplicating code.
* test_embed.InitConfigTests now test much more configuration
variables
* Fix _PyMainInterpreterConfig_Copy():
copy 'install_signal_handlers' attribute
* Add _PyMainInterpreterConfig_AsDict()
* Add unit tests on the main interpreter configuration
to test_embed.InitConfigTests
* test.pythoninfo: log also main_config
If tracemalloc is not tracing Python memory allocations,
_PyMem_DumpTraceback() now suggests to enable tracemalloc
to get the traceback where the memory block has been allocated.
Datetime macros like PyDate_Check() have two implementations, one using
the C API capsule and one using direct access to the datetime type
symbols defined in _datetimemodule.c. Since the direct access versions
of the macros are only used in _datetimemodule.c, they have been moved
out of "datetime.h" and into _datetimemodule.c.
The _PY_DATETIME_IMPL macro is currently necessary in order to avoid
both duplicate definitions of these macros in _datetimemodule.c and
unnecessary declarations of C API capsule-related macros and varibles in
datetime.h.
Co-Authored-By: Victor Stinner <vstinner@redhat.com>
Adds configure flags for msan and ubsan builds to make it easier to enable.
These also encode the detail that address sanitizer and memory sanitizer
should disable pymalloc.
Define MEMORY_SANITIZER when appropriate at build time and adds workarounds
to existing code to mark things as initialized where the sanitizer is otherwise unable to
determine that. This lets our build succeed under the memory sanitizer. not all tests
pass without sanitizer failures yet but we're in pretty good shape after this.
* ast.h now includes Python-ast.h and node.h
* parsetok.h now includes node.h and grammar.h
* symtable.h now includes Python-ast.h
* Modify asdl_c.py to enhance Python-ast.h:
* Add #ifndef/#define Py_PYTHON_AST_H to be able to include the header
twice
* Add "extern { ... }" for C++
* Undefine "Yield" macro conflicting with winbase.h
* Remove "#undef Yield" from C files, it's now done in Python-ast.h
* Remove now useless includes in C files
* _PyTuple_ITEMS() gives access to the tuple->ob_item field and cast the
first argument to PyTupleObject*. This internal macro is only usable if
Py_BUILD_CORE is defined.
* Replace &PyTuple_GET_ITEM(ob, 0) with _PyTuple_ITEMS(ob).
* Replace PyTuple_GET_ITEM(op, 1) with &_PyTuple_ITEMS(ob)[1].
* All internal header files now require Py_BUILD_CORE or
Py_BUILD_CORE_BUILTIN to be defined.
* _json.c is now compiled with Py_BUILD_CORE_BUILTIN to access
pycore_accu.h header.
* Add an example to Modules/Setup to show how to build _json
as a built-in module; it requires non trivial compiler options.
This typo doesn't affect the result because wrong bits are discarded
on implicit conversion to unsigned char, but it trips UBSan
with -fsanitize=implicit-integer-truncation.
https://bugs.python.org/issue35194
_testcapimodule.c must not include pycore_pathconfig.h, since it's an
internal header files.
Changes:
* Add _PyCoreConfig_AsDict() function to coreconfig.c.
* Remove pycore_pathconfig.h include from _testcapimodule.h.
* pycore_pathconfig.h now requires Py_BUILD_CORE to be defined.
* _testcapimodule.c compilation now fails if it's built with
Py_BUILD_CORE defined.
Two kind of mistakes:
1. Missed space. After concatenating there is no space between words.
2. Missed comma. Causes unintentional concatenating in a list of strings.
Some methods in the os module can accept path-like objects. This is documented in the general documentation but not in the function docstrings. To keep both in sync, the docstrings need to be updated to reflect that path-like objects are also accepted.
This implements getstate and setstate for the cjkcodecs multibyte incremental encoders/decoders, primarily to fix issues with seek/tell.
The encoder getstate/setstate is slightly tricky as the "state" is pending bytes + MultibyteCodec_State but only an integer can be returned. The approach I've taken is to encode this data into a long, similar to how .tell() encodes a "cookie_type" as a long.
https://bugs.python.org/issue33578
* And pycore_lifecycle.h and pycore_pathconfig.h headers to
Include/internal/
* Move Py_BUILD_CORE specific code from coreconfig.h and
pylifecycle.h to pycore_pathconfig.h and pycore_lifecycle.h
* Move _Py_wstrlist_XXX() definitions and _PyPathConfig code
from pycore_state.h to pycore_pathconfig.h
* Move "Init" and "Fini" function definitions from pylifecycle.c to
pycore_lifecycle.h.
The accu.h header is no longer part of the Python C API: it has been
moved to the "internal" headers which are restricted to Python
itself.
Replace #include "accu.h" with #include "pycore_accu.h".
If Py_BUILD_CORE is defined, the PyThreadState_GET() macro access
_PyRuntime which comes from the internal pycore_state.h header.
Public headers must not require internal headers.
Move PyThreadState_GET() and _PyInterpreterState_GET_UNSAFE() from
Include/pystate.h to Include/internal/pycore_state.h, and rename
PyThreadState_GET() to _PyThreadState_GET() there.
The PyThreadState_GET() macro of pystate.h is now redefined when
pycore_state.h is included, to use the fast _PyThreadState_GET().
Changes:
* Add _PyThreadState_GET() macro
* Replace "PyThreadState_GET()->interp" with
_PyInterpreterState_GET_UNSAFE()
* Replace PyThreadState_GET() with _PyThreadState_GET() in internal C
files (compiled with Py_BUILD_CORE defined), but keep
PyThreadState_GET() in the public header files.
* _testcapimodule.c: replace PyThreadState_GET() with
PyThreadState_Get(); the module is not compiled with Py_BUILD_CORE
defined.
* pycore_state.h now requires Py_BUILD_CORE to be defined.
* Fix potential division by zero in BZ2_Malloc()
* Avoid division by zero in PyLzma_Malloc()
* Avoid division by zero and integer overflow in PyZlib_Malloc()
Reported by Svace static analyzer.
Replace assert() with _PyObject_ASSERT() in Modules/gcmodule.c
to dump the faulty object on assertion failure to ease debugging.
Fix also indentation of a large comment.
Initial patch written by David Malcolm.
Co-Authored-By: David Malcolm <dmalcolm@redhat.com>
Declare functions with EXTINLINE:
* mpd_del()
* mpd_uint_zero()
* mpd_qresize()
* mpd_qresize_zero()
* mpd_minalloc()
These functions are implemented with "inline" or "ALWAYS_INLINE", but
declared without inline which cause linker error on Visual Studio in
Debug mode when using /Ob1.
_PyTraceMalloc_NewReference() is now called by _Py_NewReference(), so
move its definition to object.h. Moreover, define it even if
Py_LIMITED_API is defined, since _Py_NewReference() is also exposed
even if Py_LIMITED_API is defined.
Changes:
* Add _PyObject_AssertFailed() function.
* Add _PyObject_ASSERT() and _PyObject_ASSERT_WITH_MSG() macros.
* gc_decref(): replace assert() with _PyObject_ASSERT_WITH_MSG() to
dump the faulty object if the assertion fails.
_PyObject_AssertFailed() calls:
* _PyMem_DumpTraceback(): try to log the traceback where the object
memory has been allocated if tracemalloc is enabled.
* _PyObject_Dump(): log repr(obj).
* Py_FatalError(): log the current Python traceback.
_PyObject_AssertFailed() uses _PyObject_IsFreed() heuristic to check
if the object memory has been freed by a debug hook on Python memory
allocators.
Initial patch written by David Malcolm.
Co-Authored-By: David Malcolm <dmalcolm@redhat.com>
* Add Py_STATIC_INLINE() macro to declare a "static inline" function.
If the compiler supports it, try to always inline the function even if no
optimization level was specified.
* Modify pydtrace.h to use Py_STATIC_INLINE() when WITH_DTRACE is
not defined.
* Add an unit test on Py_DECREF() to make sure that
_Py_NegativeRefcount() reports the correct filename.
tracemalloc now tries to update the traceback when an object is
reused from a "free list" (optimization for faster object creation,
used by the builtin list type for example).
Changes:
* Add _PyTraceMalloc_NewReference() function which tries to update
the Python traceback of a Python object.
* _Py_NewReference() now calls _PyTraceMalloc_NewReference().
* Add an unit test.
* Use _PyUnicode_Copy in sanitize_isoformat_str
* Use repr in fromisoformat error message
This reverses commit 67b74a98b2 per Serhiy Storchaka's suggestion:
I suggested to use %R in the error message because including the raw
string can be confusing in the case of empty string, or string
containing trailing whitespaces, invisible or unprintable characters.
We agree that it is better to change both the C and pure Python versions
to use repr.
* Retain non-sanitized dtstr for error printing
This does not create an extra string, it just holds on to a reference to
the original input string for purposes of creating the error message.
* PEP 7 fixes to from_isoformat
* Separate handling of Unicode and other errors
In the initial implementation, errors other than encoding errors would
both raise an error indicating an invalid format, which would not be
true for errors like MemoryError.
* Drop needs_decref from _sanitize_isoformat_str
Instead _sanitize_isoformat_str returns a new reference, even to the
original string.
Raise ValueError OverflowError in case of a negative
_length_ in a ctypes.Array subclass. Also raise TypeError
instead of AttributeError for non-integer _length_.
Co-authored-by: Oren Milman <orenmn@gmail.com>
path_error() uses GetLastError() on Windows, but some os functions
are implemented via CRT APIs which report errors via errno.
This may result in raising OSError with invalid error code (such
as zero).
Introduce posix_path_error() function and use it where appropriate.
If buffering=1 is specified for open() in binary mode, it is silently
treated as buffering=-1 (i.e., the default buffer size).
Coupled with the fact that line buffering is always supported in Python 2,
such behavior caused several issues (e.g., bpo-10344, bpo-21332).
Warn that line buffering is not supported if open() is called with
binary mode and buffering=1.
It is now guarantied that children of xml.etree.ElementTree.Element
are Elements (at least in C implementation). Previously methods
__setitem__(), __setstate__() and __deepcopy__() could be used for
adding non-Element children.