Issue #25401: Optimize bytes.fromhex() and bytearray.fromhex(): they are now
between 2x and 3.5x faster. Changes:
* Use a fast-path working on a char* string for ASCII string
* Use a slow-path for non-ASCII string
* Replace slow hex_digit_to_int() function with a O(1) lookup in
_PyLong_DigitValue precomputed table
* Use _PyBytesWriter API to handle the buffer
* Add unit tests to check the error position in error messages
Issue #25399: Don't create temporary bytes objects: modify _PyBytes_Format() to
create work directly on bytearray objects.
* Rename _PyBytes_Format() to _PyBytes_FormatEx() just in case if something
outside CPython uses it
* _PyBytes_FormatEx() now uses (char*, Py_ssize_t) for the input string, so
bytearray_format() doesn't need tot create a temporary input bytes object
* Add use_bytearray parameter to _PyBytes_FormatEx() which is passed to
_PyBytesWriter, to create a bytearray buffer instead of a bytes buffer
Most formatting operations are now between 2.5 and 5 times faster.
Don't require _PyBytesWriter pointer to be a "char *". Same change for
_PyBytesWriter_WriteBytes() parameter.
For example, binascii uses "unsigned char*".
* Remove _PyBytes_FormatLong(): inline it into formatlong()
* the input type is always a long, so remove the code for bool
* don't duplicate the string if the length does not change
* Use PyUnicode_DATA() instead of _PyUnicode_AsString()
Addresses the float -> string conversion, using David Gay's code which
was added in Mark Dickinson's checkin r71663.
Also addresses these, which are intertwined with the short repr
changes:
- Issue #5772: format(1e100, '<') produces '1e+100', not '1.0e+100'
- Issue #5515: 'n' formatting with commas no longer works poorly
with leading zeros.
- PEP 378 Format Specifier for Thousands Separator: implemented
for floats.
This is incomplete, but I want to get some version into the next alpha. I am still working on:
Documentation.
More tests.
Implement for floats.
In addition, there's an existing bug with 'n' formatting that carries forward to thousands grouping (issue 5515).
Before the patch a lot of internal types weren't available in the header files. The patch exposes the new iterators, views and some other types to all C modules. I've also renamed some of the types and tp_names.
be answered with the comments removed.
There are many places that require checks when doing arithmetic for memory
sizes when allocating memory. Otherwise, overflow is possible with
a subsequent crash.
Fix SF #1777057 which was a result of not initializing the new BufferError
properly. Had to update the test for exceptions for BufferError too.
This uses up to 12.5% overallocation, not entirely unlike list_resize().
Could probably use more tweaks for odd allocation patterns, TBD.
Also add __alloc__() method which returns the actually allocated size.
PS. I'm now convinced that we need something like "".join(); later.