Commit Graph

6334 Commits

Author SHA1 Message Date
Pablo Galindo 372d705d95
bpo-33234 Improve list() pre-sizing for inputs with known lengths (GH-9846)
The list() constructor isn't taking full advantage of known input
lengths or length hints. This commit makes the constructor
pre-size and not over-allocate when the input size is known (the
input collection implements __len__). One on the main advantages is
that this provides 12% difference in memory savings due to the difference
between overallocating and allocating exactly the input size.

For efficiency purposes and to avoid a performance regression for small
generators and collections, the size of the input object is calculated using
__len__ and not __length_hint__, as the later is considerably slower.
2018-10-28 20:16:26 +00:00
Pablo Galindo 49c75a8086
bpo-35064 prefix smelly symbols that appear with COUNT_ALLOCS with _Py_ (GH-10152)
Configuring python with ./configure --with-pydebug CFLAGS="-D COUNT_ALLOCS -O0"
makes "make smelly" fail as some symbols were being exported without the "Py_" or
"_Py" prefixes.
2018-10-28 15:02:17 +00:00
jdemeyer aeb1be5868 bpo-34751: improved hash function for tuples (GH-9471) 2018-10-27 20:06:38 -04:00
Victor Stinner 50fe3f8913
bpo-9263: _PyXXX_CheckConsistency() use _PyObject_ASSERT() (GH-10108)
Use _PyObject_ASSERT() in:

* _PyDict_CheckConsistency()
* _PyType_CheckConsistency()
* _PyUnicode_CheckConsistency()

_PyObject_ASSERT() dumps the faulty object if the assertion fails
to help debugging.
2018-10-26 18:47:15 +02:00
Victor Stinner 0862505a03
bpo-9263: Use _PyObject_ASSERT() in typeobject.c (GH-10111)
Replace assert() with _PyObject_ASSERT() in Objects/typeobject.c
to dump the faulty object on assertion failure to ease debugging.
2018-10-26 18:39:11 +02:00
Victor Stinner 24702044af
bpo-9263: Use _PyObject_ASSERT() in object.c (GH-10110)
Replace assert() with _PyObject_ASSERT() in Objects/object.c to dump
the faulty object on assertion failure to ease debugging.
2018-10-26 17:16:37 +02:00
Victor Stinner b4435e20a9
bpo-35059: Convert PyObject_INIT() to function (GH-10077)
* Convert PyObject_INIT() and PyObject_INIT_VAR() macros to static
  inline functions.
* Fix usage of these functions: cast to PyObject* or PyVarObject*.
2018-10-26 14:35:00 +02:00
Victor Stinner 3ec9af75f6
bpo-9263: _Py_NegativeRefcount() use _PyObject_AssertFailed() (GH-10109)
_Py_NegativeRefcount() now uses _PyObject_AssertFailed() to dump the
object to help debugging.
2018-10-26 02:12:34 +02:00
Victor Stinner 626bff8568
bpo-9263: Dump Python object on GC assertion failure (GH-10062)
Changes:

* Add _PyObject_AssertFailed() function.
* Add _PyObject_ASSERT() and _PyObject_ASSERT_WITH_MSG() macros.
* gc_decref(): replace assert() with _PyObject_ASSERT_WITH_MSG() to
  dump the faulty object if the assertion fails.

_PyObject_AssertFailed() calls:

* _PyMem_DumpTraceback(): try to log the traceback where the object
  memory has been allocated if tracemalloc is enabled.
* _PyObject_Dump(): log repr(obj).
* Py_FatalError(): log the current Python traceback.

_PyObject_AssertFailed() uses _PyObject_IsFreed() heuristic to check
if the object memory has been freed by a debug hook on Python memory
allocators.

Initial patch written by David Malcolm.

Co-Authored-By: David Malcolm <dmalcolm@redhat.com>
2018-10-25 17:31:10 +02:00
Victor Stinner 18618e652c
bpo-35059: Add Py_STATIC_INLINE() macro (GH-10093)
* Add Py_STATIC_INLINE() macro to declare a "static inline" function.
  If the compiler supports it, try to always inline the function even if no
  optimization level was specified.
* Modify pydtrace.h to use Py_STATIC_INLINE() when WITH_DTRACE is
  not defined.
* Add an unit test on Py_DECREF() to make sure that
  _Py_NegativeRefcount() reports the correct filename.
2018-10-25 17:28:11 +02:00
Victor Stinner 9e00e80e21
bpo-35053: Enhance tracemalloc to trace free lists (GH-10063)
tracemalloc now tries to update the traceback when an object is
reused from a "free list" (optimization for faster object creation,
used by the builtin list type for example).

Changes:

* Add _PyTraceMalloc_NewReference() function which tries to update
  the Python traceback of a Python object.
* _Py_NewReference() now calls _PyTraceMalloc_NewReference().
* Add an unit test.
2018-10-25 13:31:16 +02:00
Serhiy Storchaka c46db9232f
bpo-30863: Rewrite PyUnicode_AsWideChar() and PyUnicode_AsWideCharString(). (GH-2599)
They no longer cache the wchar_t* representation of string objects.
2018-10-23 22:58:24 +03:00
Victor Stinner 82af0b63b0
bpo-9263: _PyObject_Dump() detects freed memory (GH-10061)
_PyObject_Dump() now uses an heuristic to check if the object memory
has been freed: log "<freed object>" in that case.

The heuristic rely on the debug hooks on Python memory allocators
which fills the memory with DEADBYTE (0xDB) when memory is
deallocated. Use PYTHONMALLOC=debug to always enable these debug
hooks.
2018-10-23 17:39:40 +02:00
Eric Lippert 5a95ba29da Fix issue 34551 - remove redundant store (#9009)
The assignment of i/2 to nk is redundant because on this code path, nk is already the size of the dictionary, and i is already twice the size of the dictionary. I've replaced the store with an assertion that i/2 is nk.
2018-10-22 16:52:46 +01:00
Serhiy Storchaka 2c2044e789
bpo-34984: Improve error messages for bytes and bytearray constructors. (GH-9874) 2018-10-21 15:29:12 +03:00
Serhiy Storchaka 914f9a078f
bpo-34973: Fix crash in bytes constructor. (GH-9841)
Constructing bytes from mutating list could cause a crash.
2018-10-21 15:25:53 +03:00
Sergey Fedoseev a5259fb05d bpo-34574: Prevent OrderedDict iterators from exhaustion during pickling. (GH-9051) 2018-10-20 08:20:39 +03:00
Sergey Fedoseev 6395844e6a bpo-34573: Simplify __reduce__() of set and dict iterators. (GH-9050)
Simplify the pickling of set and dictionary objects iterators by consuming
the iterator into a list with PySequence_List.
2018-10-20 01:43:33 +01:00
Serhiy Storchaka 6f17e51345 bpo-33712: OrderedDict only creates od_fast_nodes cache if needed (GH-7349) 2018-10-20 01:27:45 +02:00
Serhiy Storchaka b2e2025941 bpo-33073: Rework int.as_integer_ratio() implementation (GH-9303)
* Simplify the C code.
* Simplify tests and make them more strict and robust.
* Add references in the documentation.
2018-10-19 23:46:31 +02:00
Serhiy Storchaka e890421e33
bpo-34974: Do not replace unexpected errors in bytes() and bytearray(). (GH-9852)
bytes and bytearray constructors converted unexpected exceptions
(e.g. MemoryError and KeyboardInterrupt) to TypeError.
2018-10-15 00:02:57 +03:00
Zackery Spytz a4b48f194a bpo-34940: Fix the error handling in _check_for_legacy_statements(). (GH-9764) 2018-10-12 11:20:59 +03:00
Raymond Hettinger f1aa8aed4a
Micro-optimize list index range checks (GH-9784) 2018-10-10 20:37:28 -07:00
Emanuele Gaifas fc8205cb4b Add missing closing quote and trailing period in str.isidentifier() docstring (GH-9756)
This rectifies commit ffc5a14d00.
2018-10-08 16:14:47 +05:30
Sanyam Khurana ffc5a14d00 bpo-33014: Clarify str.isidentifier docstring (GH-6088)
* bpo-33014: Clarify str.isidentifier docstring

* bpo-33014: Add code example in isidentifier documentation
2018-10-08 12:23:32 +05:30
Zackery Spytz ae62f01524 bpo-34910: Ensure that PyObject_Print() always returns -1 on error. (GH-9733) 2018-10-06 09:44:25 +03:00
Zackery Spytz 7bb9cd0a67 bpo-34899: Fix a possible assertion failure due to int_from_bytes_impl() (GH-9705)
The _PyLong_FromByteArray() call in int_from_bytes_impl() was
unchecked.
2018-10-06 00:02:23 +03:00
Zackery Spytz 96c5932794 bpo-34879: Fix a possible null pointer dereference in bytesobject.c (GH-9683)
formatfloat() was not checking if PyBytes_FromStringAndSize()
failed, which could lead to a null pointer dereference in
_PyBytes_FormatEx().
2018-10-03 09:01:30 +03:00
Victor Stinner e972c13624
bpo-30156: Remove property_descr_get() optimization (GH-9541)
property_descr_get() uses a "cached" tuple to optimize function
calls. But this tuple can be discovered in debug mode with
sys.getobjects(). Remove the optimization, it's not really worth it
and it causes 3 different crashes last years.

Microbenchmark:

./python -m perf timeit -v \
    -s "from collections import namedtuple; P = namedtuple('P', 'x y'); p = P(1, 2)" \
    --duplicate 1024 "p.x"

Result:

Mean +- std dev: [ref] 32.8 ns +- 0.8 ns -> [patch] 40.4 ns +- 1.3 ns: 1.23x slower (+23%)
2018-10-01 03:03:22 -07:00
INADA Naoki 2aaf98c16a bpo-34320: Fix dict(o) didn't copy order of dict subclass (GH-8624)
When dict subclass overrides order (`__iter__()`, `keys()`, and `items()`), `dict(o)`
should use it instead of dict ordering.


https://bugs.python.org/issue34320
2018-09-25 20:59:00 -07:00
Raymond Hettinger 73820a60cc
Fix compiler warning with a type cast (GH-9300) 2018-09-14 01:35:59 -07:00
Raymond Hettinger 00bc08ec11
Fix-up parenthesis, organization, and NULL check (GH-9297) 2018-09-14 01:00:11 -07:00
Lisa Roach 5ac704306f bpo-33073: Adding as_integer_ratio to ints. (GH-8750) 2018-09-13 23:56:23 -07:00
Benjamin Peterson e502451781
closes bpo-34646: Remove PyAPI_* macros from declarations. (GH-9218) 2018-09-12 12:06:42 -07:00
Sergey Fedoseev 6c7d67ce83 bpo-1621: Avoid signed integer overflow in set_table_resize(). (GH-9059)
Address a C undefined behavior signed integer overflow issue in set object table resizing.  Our -fwrapv compiler flag and practical reasons why sets are unlikely to get this large should mean this was never an issue but it was incorrect code that generates code analysis warnings.

<!-- issue-number: [bpo-1621](https://www.bugs.python.org/issue1621) -->
https://bugs.python.org/issue1621
<!-- /issue-number -->
2018-09-11 16:18:01 -07:00
Victor Stinner 998b806366
Revert "bpo-34595: Add %T format to PyUnicode_FromFormatV() (GH-9080)" (GH-9187)
This reverts commit 886483e2b9.
2018-09-12 00:23:25 +02:00
Peter Eisentraut 0e0bc4e221 Fix misleading mentions of tp_size in comments (GH-9093)
Many type object initializations labeled a field "tp_size" in the
comment, but the name of that field is tp_basicsize.
2018-09-10 09:46:08 -07:00
Victor Stinner 886483e2b9
bpo-34595: Add %T format to PyUnicode_FromFormatV() (GH-9080)
* Add %T format to PyUnicode_FromFormatV(), and so to
  PyUnicode_FromFormat() and PyErr_Format(), to format an object type
  name: equivalent to "%s" with Py_TYPE(obj)->tp_name.
* Replace Py_TYPE(obj)->tp_name with %T format in unicodeobject.c.
* Add unit test on %T format.
* Rename unicode_fromformat_write_cstr() to
  unicode_fromformat_write_utf8(), to make the intent more explicit.
2018-09-07 18:00:58 +02:00
jdemeyer 8f735485ac bpo-25750: fix refcounts in type_getattro() (GH-6118)
When calling tp_descr_get(self, obj, type), make sure that
we own a strong reference to "self".
2018-09-07 09:37:00 +02:00
Sergey Fedoseev 593bb30e82 closes bpo-34599: Improve performance of _Py_bytes_capitalize(). (GH-9083) 2018-09-06 21:54:49 -07:00
Victor Stinner 3d4226a832
bpo-34523: Support surrogatepass in locale codecs (GH-8995)
Add support for the "surrogatepass" error handler in
PyUnicode_DecodeFSDefault() and PyUnicode_EncodeFSDefault()
for the UTF-8 encoding.

Changes:

* _Py_DecodeUTF8Ex() and _Py_EncodeUTF8Ex() now support the
  surrogatepass error handler (_Py_ERROR_SURROGATEPASS).
* _Py_DecodeLocaleEx() and _Py_EncodeLocaleEx() now use
  the _Py_error_handler enum instead of "int surrogateescape" to pass
  the error handler. These functions now return -3 if the error
  handler is unknown.
* Add unit tests on _Py_DecodeLocaleEx() and _Py_EncodeLocaleEx()
  in test_codecs.
* Rename get_error_handler() to _Py_GetErrorHandler() and expose it
  as a private function.
* _freeze_importlib doesn't need config.filesystem_errors="strict"
  workaround anymore.
2018-08-29 22:21:32 +02:00
Victor Stinner b2457efc78
bpo-34523: Add _PyCoreConfig.filesystem_encoding (GH-8963)
_PyCoreConfig_Read() is now responsible to choose the filesystem
encoding and error handler. Using Py_Main(), the encoding is now
chosen even before calling Py_Initialize().

_PyCoreConfig.filesystem_encoding is now the reference, instead of
Py_FileSystemDefaultEncoding, for the Python filesystem encoding.

Changes:

* Add filesystem_encoding and filesystem_errors to _PyCoreConfig
* _PyCoreConfig_Read() now reads the locale encoding for the file
  system encoding.
* PyUnicode_EncodeFSDefault() and PyUnicode_DecodeFSDefaultAndSize()
  now use the interpreter configuration rather than
  Py_FileSystemDefaultEncoding and Py_FileSystemDefaultEncodeErrors
  global configuration variables.
* Add _Py_SetFileSystemEncoding() and _Py_ClearFileSystemEncoding()
  private functions to only modify Py_FileSystemDefaultEncoding and
  Py_FileSystemDefaultEncodeErrors in coreconfig.c.
* _Py_CoerceLegacyLocale() now takes an int rather than
  _PyCoreConfig for the warning.
2018-08-29 13:25:36 +02:00
Alexey Izbyshev b57b4ac042 closes bpo-34504: Remove the useless NULL check in PySequence_Check(). (GH-8935)
Reported by Svace static analyzer.
2018-08-25 16:52:27 -07:00
Alexey Izbyshev 5f79b50763 closes bpo-34501: PyType_FromSpecWithBases: Check spec->name before dereferencing it. (GH-8930)
Reported by Svace static analyzer.
2018-08-25 11:53:47 -07:00
Alexey Izbyshev 8fdd331bbf closes bpo-34493: Objects/genobject.c: Add missing NULL check to compute_cr_origin() (GH-8911) 2018-08-25 00:15:23 -07:00
Alexey Izbyshev 7ecae3ca0b closes bpo-34468: Objects/rangeobject.c: Fix an always-false condition in range_repr() (GH-8880)
Also, propagate the error from PyNumber_AsSsize_t() because we don't care
only about OverflowError which is not reported if the second argument is NULL.

Reported by Svace static analyzer.
2018-08-23 21:39:45 -07:00
Alexey Izbyshev f6247aac08 closes bpo-34477: Objects/typeobject.c: Add missing NULL check to type_init() (GH-8876)
Reported by Svace static analyzer.
2018-08-23 21:22:16 -07:00
Alexey Izbyshev ccd9975267 bpo-34436: Fix check that disables overallocation for the last fmt specifier (GH-8826)
Reported by Svace static analyzer.
2018-08-23 09:50:52 +02:00
Alexey Izbyshev 74a307d48e bpo-34435: Add missing NULL check to unicode_encode_ucs1(). (GH-8823)
Reported by Svace static analyzer.
2018-08-19 21:52:04 +03:00
Zackery Spytz e349bf2358 bpo-22602: Raise an exception in the UTF-7 decoder for ill-formed sequences starting with "+". (GH-8741)
The UTF-7 decoder now raises UnicodeDecodeError for ill-formed
sequences starting with "+" (as specified in RFC 2152).
2018-08-19 07:43:38 +03:00