str.encode, bytes.decode and bytearray.decode now use an
internal API to throw LookupError for known non-text encodings,
rather than attempting the encoding or decoding operation and
then throwing a TypeError for an unexpected output type.
The latter mechanism remains in place for third party non-text
encodings.
- output type errors now redirect users to the type-neutral
convenience functions in the codecs module
- stateless errors that occur during encoding and decoding
will now be automatically wrapped in exceptions that give
the name of the codec involved
_PyUnicode_CompareWithId() is faster than PyUnicode_CompareWithASCIIString()
when both strings are equal and interned.
Add also _PyId_builtins identifier for "builtins" common string.
instead of creating temporary Unicode string objects
Add also more identifiers in pythonrun.c to avoid temporary Unicode string
objets for the interactive interpreter.
- don't call PyErr_NoMemory with interpreter is not initialised
- note that it's OK to call _PyMem_RawStrDup here
- don't include this in the limited API
- capitalise "IO"
- be explicit that a non-zero return indicates an error
- include versionadded marker in docs
This new pre-initialization API allows embedding
applications like Blender to force a particular
encoding and error handler for the standard IO streams.
Also refactors Modules/_testembed.c to let us start
testing multiple embedding scenarios.
(Initial patch by Bastien Montagne)
The setobject freelist was consuming memory but not providing much value.
Even when a freelisted setobject was available, most of the setobject
fields still needed to be initialized and the small table still required
a memset(). This meant that the custom freelisting scheme for sets was
providing almost no incremental benefit over the default Python freelist
scheme used by _PyObject_Malloc() in Objects/obmalloc.c.
-I
Run Python in isolated mode. This also implies -E and -s. In isolated mode
sys.path contains neither the script’s directory nor the user’s
site-packages directory. All PYTHON* environment variables are ignored,
too. Further restrictions may be imposed to prevent the user from
injecting malicious code.
PyStructSequence_InitType() except that it has a return value (0 on success,
-1 on error).
* PyStructSequence_InitType2() now raises MemoryError on memory allocation failure
* Fix also some calls to PyDict_SetItemString(): handle error
Add new enum:
* PyMemAllocatorDomain
Add new structures:
* PyMemAllocator
* PyObjectArenaAllocator
Add new functions:
* PyMem_RawMalloc(), PyMem_RawRealloc(), PyMem_RawFree()
* PyMem_GetAllocator(), PyMem_SetAllocator()
* PyObject_GetArenaAllocator(), PyObject_SetArenaAllocator()
* PyMem_SetupDebugHooks()
Changes:
* PyMem_Malloc()/PyObject_Realloc() now always call malloc()/realloc(), instead
of calling PyObject_Malloc()/PyObject_Realloc() in debug mode.
* PyObject_Malloc()/PyObject_Realloc() now falls back to
PyMem_Malloc()/PyMem_Realloc() for allocations larger than 512 bytes.
* Redesign debug checks on memory block allocators as hooks, instead of using C
macros
* Add a new PyMemAllocators structure
* New functions:
- PyMem_RawMalloc(), PyMem_RawRealloc(), PyMem_RawFree(): GIL-free memory
allocator functions
- PyMem_GetRawAllocators(), PyMem_SetRawAllocators()
- PyMem_GetAllocators(), PyMem_SetAllocators()
- PyMem_SetupDebugHooks()
- _PyObject_GetArenaAllocators(), _PyObject_SetArenaAllocators()
* Add unit test for PyMem_Malloc(0) and PyObject_Malloc(0)
* Add unit test for new get/set allocators functions
* PyObject_Malloc() now falls back on PyMem_Malloc() instead of malloc() if
size is bigger than SMALL_REQUEST_THRESHOLD, and PyObject_Realloc() falls
back on PyMem_Realloc() instead of realloc()
* PyMem_Malloc() and PyMem_Realloc() now always call malloc() and realloc(),
instead of calling PyObject_Malloc() and PyObject_Realloc() in debug mode
Forgot to raise ModuleNotFoundError when None is found in sys.modules.
This led to introducing the C function PyErr_SetImportErrorSubclass()
to make setting ModuleNotFoundError easier.
Also updated the reference docs to mention ModuleNotFoundError
appropriately. Updated the docs for ModuleNotFoundError to mention the
None in sys.modules case.
Lastly, it was noticed that PyErr_SetImportError() was not setting an
exception when returning None in one case. That issue is now fixed.
ImportError.
The exception is raised by import when a module could not be found.
Technically this is defined as no viable loader could be found for the
specified module. This includes ``from ... import`` statements so that
the module usage is consistent for all situations where import
couldn't find what was requested.
This should allow for the common idiom of::
try:
import something
except ImportError:
pass
to be updated to using ModuleNotFoundError and not accidentally mask
ImportError messages that should propagate (e.g. issues with a
loader).
This work was driven by the fact that the ``from ... import``
statement needed to be able to tell the difference between an
ImportError that simply couldn't find a module (and thus silence the
exception so that ceval can raise it) and an ImportError that
represented an actual problem.
Note that this is a potentially disruptive change since it may
release some system resources which would otherwise remain
perpetually alive (e.g. database connections kept in thread-local
storage).
* Add also min_char attribute to _PyUnicodeWriter structure (currently unused)
* _PyUnicodeWriter_Init() has no more argument (except the writer itself):
min_length and overallocate must be set explicitly
* In error handlers, only enable overallocation if the replacement string
is longer than 1 character
* CJK decoders don't use overallocation anymore
* Set min_length, instead of preallocating memory using
_PyUnicodeWriter_Prepare(), in many decoders
* _PyUnicode_DecodeUnicodeInternal() checks for integer overflow
Write a function to enable more optimizations:
* If the substring is the whole string and overallocation is disabled, just
keep a reference to the string, don't copy characters
* Avoid a call to the expensive _PyUnicode_FindMaxChar() function when
possible
computation as the overflow behavior of signed integers is undefined.
NOTE: This change is smaller compared to 3.2 as much of this cleanup had
already been done. I added the comment that my change in 3.2 added so that the
code would match up. Otherwise this just adds or synchronizes appropriate UL
designations on some constants to be pedantic.
In practice we require compiling everything with -fwrapv which forces overflow
to be defined as twos compliment but this keeps the code cleaner for checkers
or in the case where someone has compiled it without -fwrapv or their
compiler's equivalent. We could work to get rid of the -fwrapv requirement
in 3.4 but that requires more planning.
Found by Clang trunk's Undefined Behavior Sanitizer (UBSan).
Cleanup only - no functionality or hash values change.
computation as the overflow behavior of signed integers is undefined.
NOTE: This change is smaller compared to 3.2 as much of this cleanup had
already been done. I added the comment that my change in 3.2 added so that the
code would match up. Otherwise this just adds or synchronizes appropriate UL
designations on some constants to be pedantic.
In practice we require compiling everything with -fwrapv which forces overflow
to be defined as twos compliment but this keeps the code cleaner for checkers
or in the case where someone has compiled it without -fwrapv or their
compiler's equivalent.
Found by Clang trunk's Undefined Behavior Sanitizer (UBSan).
Cleanup only - no functionality or hash values change.
computation as the overflow behavior of signed integers is undefined.
In practice we require compiling everything with -fwrapv which forces overflow
to be defined as twos compliment but this keeps the code cleaner for checkers
or in the case where someone has compiled it without -fwrapv or their
compiler's equivalent.
Found by Clang trunk's Undefined Behavior Sanitizer (UBSan).
Cleanup only - no functionality or hash values change.
longer required as of Python 2.5+ when the gc_refs changed from an int (4
bytes) to a Py_ssize_t (8 bytes) as the minimum size is 16 bytes.
The use of a 'long double' triggered a warning by Clang trunk's
Undefined-Behavior Sanitizer as on many platforms a long double requires
16-byte alignment but the Python memory allocator only guarantees 8 byte
alignment.
So our code would allocate and use these structures with technically improper
alignment. Though it didn't matter since the 'dummy' field is never used.
This silences that warning.
Spelunking into code history, the double was added in 2001 to force better
alignment on some platforms and changed to a long double in 2002 to appease
Tru64. That issue should no loner be present since the upgrade from int to
Py_ssize_t where the minimum structure size increased to 16 (unless anyone
knows of a platform where ssize_t is 4 bytes?) or 24 bytes depending on if the
build uses 4 or 8 byte pointers.
We can probably get rid of the double and this union hack all together today.
That is a slightly more invasive change that can be left for later.
A more correct non-hacky alternative if any alignment issues are still found
would be to use a compiler specific alignment declaration on the structure and
determine which value to use at configure time.
ASCII/surrogateescape codec is now used, instead of the locale encoding, to
decode the command line arguments. This change fixes inconsistencies with
os.fsencode() and os.fsdecode() because these operating systems announces an
ASCII locale encoding, whereas the ISO-8859-1 encoding is used in practice.
Previously, excessive nesting in expressions would blow the
stack and segfault the interpreter. Now, a hard limit based
on the configured recursion limit and a hardcoded scaling
factor is applied.
... (unsigned long and unsigned int) to avoid an undefined behaviour with
Py_TPFLAGS_TYPE_SUBCLASS ((1 << 31). PyType_GetFlags() result type is now
unsigned too (unsigned long, instead of long).
* Simplify the code: replace 4 steps with one unique step using the
_PyUnicodeWriter API. PyUnicode_Format() has the same design. It avoids to
store intermediate results which require to allocate an array of pointers on
the heap.
* Use the _PyUnicodeWriter API for speed (and its convinient API):
overallocate the buffer to reduce the number of "realloc()"
* Implement "width" and "precision" in Python, don't rely on sprintf(). It
avoids to need of a temporary buffer allocated on the heap: only use a small
buffer allocated in the stack.
* Add _PyUnicodeWriter_WriteCstr() function
* Split PyUnicode_FromFormatV() into two functions: add
unicode_fromformat_arg().
* Inline parse_format_flags(): the format of an argument is now only parsed
once, it's no more needed to have a subfunction.
* Optimize PyUnicode_FromFormatV() for characters between two "%" arguments:
search the next "%" and copy the substring in one chunk, instead of copying
character per character.
sporadic crashes in multi-thread programs when several long deallocator
chains ran concurrently and involved subclasses of built-in container
types.
Note that the trashcan functions are part of the stable ABI, therefore
they have to be kept around for binary compatibility of extensions.
sporadic crashes in multi-thread programs when several long deallocator
chains ran concurrently and involved subclasses of built-in container
types.
Because of this change, a couple extension modules compiled for 3.2.4
(those which use the trashcan mechanism, despite it being undocumented)
will not be loadable by 3.2.3 and earlier. However, extension modules
compiled for 3.2.3 and earlier will be loadable by 3.2.4.
sporadic crashes in multi-thread programs when several long deallocator
chains ran concurrently and involved subclasses of built-in container
types.
Because of this change, a couple extension modules compiled for 3.2.4
(those which use the trashcan mechanism, despite it being undocumented)
will not be loadable by 3.2.3 and earlier. However, extension modules
compiled for 3.2.3 and earlier will be loadable by 3.2.4.
PyImport_ImportModuleLevel() with a 'level' of 0 instead of -1 as the
latter is no longer a valid value.
Also added a versionchanged note for PyImport_ImportModuleLevel() just
in case people don't make the connection between changes to
__import__() and this C function.
Fix also its value on Windows and Linux according to its documentation:
"adjustable" indicates if the clock *can be* adjusted, not if it is or was
adjusted.
In most cases, it is not possible to indicate if a clock is or was adjusted.
* Formatting string, int, float and complex use the _PyUnicodeWriter API. It
avoids a temporary buffer in most cases.
* Add _PyUnicodeWriter_WriteStr() to restore the PyAccu optimization: just
keep a reference to the string if the output is only composed of one string
* Disable overallocation when formatting the last argument of str%args and
str.format(args)
* Overallocation allocates at least 100 characters: add min_length attribute
to the _PyUnicodeWriter structure
* Add new private functions: _PyUnicode_FastCopyCharacters(),
_PyUnicode_FastFill() and _PyUnicode_FromASCII()
The speed up is around 20% in average.
Removed futimens as it is now redundant.
Changed shutil.copystat to use st_atime_ns and st_mtime_ns from os.stat
and ns= parameter to utime--it once again preserves exact metadata on Linux!
* Rename time.steady() to time.monotonic()
* On Windows, time.monotonic() uses GetTickCount/GetTickCount64() instead of
QueryPerformanceCounter()
* time.monotonic() uses CLOCK_HIGHRES if available
* Add time.get_clock_info(), time.perf_counter() and time.process_time()
functions
* Remove _PyBytes_FormatLong(): inline it into formatlong()
* the input type is always a long, so remove the code for bool
* don't duplicate the string if the length does not change
* Use PyUnicode_DATA() instead of _PyUnicode_AsString()
importlib._bootstrap is now frozen into Python/importlib.h and stored
as _frozen_importlib in sys.modules. Py_Initialize() loads the frozen
code along with sys and imp and then uses _frozen_importlib._install()
to set builtins.__import__() w/ _frozen_importlib.__import__().
Currently import does not use these attributes as they are planned
for use by importlib (which will be another commit).
Thanks to Filip Gruszczyński for the initial patch and Brian Curtin
for refining it.
This allows generators that are using yield from to be seen by debuggers. It
also kills the f_yieldfrom field on frame objects.
Patch mostly from Mark Shannon with a few tweaks by me.
time.ctime(), gmtime(), time.localtime(), datetime.date.fromtimestamp(),
datetime.datetime.fromtimestamp() and datetime.datetime.utcfromtimestamp() now
raises an OverflowError, instead of a ValueError, if the timestamp does not fit
in time_t.
datetime.datetime.fromtimestamp() and datetime.datetime.utcfromtimestamp() now
round microseconds towards zero instead of rounding to nearest with ties going
away from zero.
and lifetime issues of dynamically allocated Py_buffer members (#9990)
as well as crashes (#8305, #7433). Many new features have been added
(See whatsnew/3.3), and the documentation has been updated extensively.
The ndarray test object from _testbuffer.c implements all aspects of
PEP-3118, so further development towards the complete implementation
of the PEP can proceed in a test-driven manner.
Thanks to Nick Coghlan, Antoine Pitrou and Pauli Virtanen for review
and many ideas.
- Issue #12834: Fix incorrect results of memoryview.tobytes() for
non-contiguous arrays.
- Issue #5231: Introduce memoryview.cast() method that allows changing
format and shape without making a copy of the underlying memory.
* Decode thousands separator and decimal point using PyUnicode_DecodeLocale()
(from the locale encoding), instead of decoding them implicitly from latin1
* Remove _PyUnicode_InsertThousandsGroupingLocale(), it was not used
* Change _PyUnicode_InsertThousandsGrouping() API to return the maximum
character if unicode is NULL
* Replace MIN/MAX macros by Py_MIN/Py_MAX
* stringlib/undef.h undefines STRINGLIB_IS_UNICODE
* stringlib/localeutil.h only supports Unicode
in order to make algorithmic complexity attacks on (e.g.) web apps much more complicated.
The environment variable PYTHONHASHSEED and the new command line flag -R control this
behavior.
in order to make algorithmic complexity attacks on (e.g.) web apps much more complicated.
The environment variable PYTHONHASHSEED and the new command line flag -R control this
behavior.