Issue #25401: Optimize bytes.fromhex() and bytearray.fromhex(): they are now
between 2x and 3.5x faster. Changes:
* Use a fast-path working on a char* string for ASCII string
* Use a slow-path for non-ASCII string
* Replace slow hex_digit_to_int() function with a O(1) lookup in
_PyLong_DigitValue precomputed table
* Use _PyBytesWriter API to handle the buffer
* Add unit tests to check the error position in error messages
Issue #25399: Don't create temporary bytes objects: modify _PyBytes_Format() to
create work directly on bytearray objects.
* Rename _PyBytes_Format() to _PyBytes_FormatEx() just in case if something
outside CPython uses it
* _PyBytes_FormatEx() now uses (char*, Py_ssize_t) for the input string, so
bytearray_format() doesn't need tot create a temporary input bytes object
* Add use_bytearray parameter to _PyBytes_FormatEx() which is passed to
_PyBytesWriter, to create a bytearray buffer instead of a bytes buffer
Most formatting operations are now between 2.5 and 5 times faster.
Issue #25274: sys.setrecursionlimit() now raises a RecursionError if the new
recursion limit is too low depending at the current recursion depth. Modify
also the "lower-water mark" formula to make it monotonic. This mark is used to
decide when the overflowed flag of the thread state is reset.
Don't require _PyBytesWriter pointer to be a "char *". Same change for
_PyBytesWriter_WriteBytes() parameter.
For example, binascii uses "unsigned char*".
Optimize bytes.__mod__(args) for integere formats: %d (%i, %u), %o, %x and %X.
_PyBytesWriter is now used to format directly the integer into the writer
buffer, instead of using a temporary bytes object.
Formatting is between 30% and 50% faster on a microbenchmark.
Add a new private API to optimize Unicode encoders. It uses a small buffer
allocated on the stack and supports overallocation.
Use _PyBytesWriter API for UCS1 (ASCII and Latin1) and UTF-8 encoders. Enable
overallocation for the UTF-8 encoder with error handlers.
unicode_encode_ucs1(): initialize collend to collstart+1 to not check the
current character twice, we already know that it is not ASCII.
Python.h header to fix a compilation error with OpenMP. PyThreadState_GET()
becomes an alias to PyThreadState_Get() to avoid ABI incompatibilies.
It is important that the _PyThreadState_Current variable is always accessed
with the same implementation of pyatomic.h. Use the PyThreadState_Get()
function so extension modules will all reuse the same implementation.
On Windows, the tv_sec field of the timeval structure has the type C long,
whereas it has the type C time_t on all other platforms. A C long has a size of
32 bits (signed inter, 1 bit for the sign, 31 bits for the value) which is not
enough to store an Epoch timestamp after the year 2038.
Add the _PyTime_AsTimevalTime_t() function written for datetime.datetime.now():
convert a _PyTime_t timestamp to a (secs, us) tuple where secs type is time_t.
It allows to support dates after the year 2038 on Windows.
Enhance also _PyTime_AsTimeval_impl() to detect overflow on the number of
seconds when rounding the number of microseconds.
On Windows, the tv_sec field of the timeval structure has the type C long,
whereas it has the type C time_t on all other platforms. A C long has a size of
32 bits (signed inter, 1 bit for the sign, 31 bits for the value) which is not
enough to store an Epoch timestamp after the year 2038.
Add the _PyTime_AsTimevalTime_t() function written for datetime.datetime.now():
convert a _PyTime_t timestamp to a (secs, us) tuple where secs type is time_t.
It allows to support dates after the year 2038 on Windows.
Enhance also _PyTime_AsTimeval_impl() to detect overflow on the number of
seconds when rounding the number of microseconds.
datetime.datetime now round microseconds to nearest with ties going to nearest
even integer (ROUND_HALF_EVEN), as round(float), instead of rounding towards
-Infinity (ROUND_FLOOR).
pytime API: replace _PyTime_ROUND_HALF_UP with _PyTime_ROUND_HALF_EVEN. Fix
also _PyTime_Divide() for negative numbers.
_PyTime_AsTimeval_impl() now reuses _PyTime_Divide() instead of reimplementing
rounding modes.
with ties going away from zero (ROUND_HALF_UP), as Python 2 and Python older
than 3.3, instead of rounding to nearest with ties going to nearest even
integer (ROUND_HALF_EVEN).
The real benefit of the unicode specialized function comes from
bypassing the overhead of PyObject_RichCompareBool() and not
from being in-lined (especially since there was almost no shared
data between the caller and callee). Also, the in-lining was
having a negative effect on code generation for the callee.