* Remove micro-optimization:
(errors == "surrogateescape" || strcmp(errors, "surrogateescape") == 0).
Only use strcmp()
* Initialize 'arg' members in unicode_format_arg() to help the compiler
diagnose real bugs and to make the code simpler to read
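A minimal sketch of the strcmp()-only check, for illustration. The helper name
is_surrogateescape() is hypothetical and is not taken from the CPython source.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical helper (not a CPython function): return 1 if `errors`
   names the "surrogateescape" error handler, 0 otherwise. */
int
is_surrogateescape(const char *errors)
{
    assert(errors != NULL);
    /* The removed micro-optimization first compared the pointer against
       the string literal. Whether two identical literals share an address
       is unspecified in C, so only the strcmp() result can be relied on. */
    return strcmp(errors, "surrogateescape") == 0;
}
```

The pointer comparison could only ever save one short strcmp() call, so
dropping it costs little and removes a line that looks like a bug.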
The ASCII/surrogateescape codec is now used, instead of the locale encoding, to
decode the command line arguments. This change fixes inconsistencies with
os.fsencode() and os.fsdecode() because these operating systems announce an
ASCII locale encoding, whereas the ISO-8859-1 encoding is used in practice.
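As a rough illustration of what ASCII/surrogateescape decoding does to a single
byte: ASCII bytes map to themselves, and each byte in 0x80..0xFF is mapped to a
lone surrogate in U+DC80..U+DCFF so that it can be encoded back unchanged. The
function name below is hypothetical and this is not the CPython implementation.

```c
#include <stdint.h>

/* Hypothetical helper: code point produced for one input byte when
   decoding with ASCII and the surrogateescape error handler. */
uint32_t
ascii_surrogateescape_decode_byte(unsigned char byte)
{
    if (byte < 0x80) {
        /* Plain ASCII: the byte value is the code point. */
        return byte;
    }
    /* Non-ASCII byte: escape it as a lone low surrogate,
       0x80..0xFF -> U+DC80..U+DCFF. */
    return 0xDC00u + byte;
}
```

Because the mapping is reversible, a command line argument that is not valid
ASCII survives a decode/encode round trip byte for byte.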
* Remove unicode_widen(): replaced with _PyUnicodeWriter_Prepare()
* Remove unicode_putchar(): replaced with
PyUnicodeWriter_Prepare() + PyUnicode_WRITER()
* When handling a decoding error, only overallocate the buffer by +25%
instead of +100%
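The +25% growth policy can be sketched as follows. The helper name and the
overflow guard are assumptions made for illustration; the actual code in the
decoder may compute the size differently.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: given the number of items actually needed, return
   the number to allocate, with 25% of spare room (instead of 100%). */
size_t
overallocate_25(size_t needed)
{
    size_t extra = needed / 4;
    size_t size;

    /* Do not overallocate if adding the spare room would overflow. */
    if (needed > (size_t)-1 - extra)
        size = needed;
    else
        size = needed + extra;

    assert(size >= needed);
    return size;
}
```

A smaller growth factor wastes less memory when errors are rare, at the price
of a few more reallocations when they are frequent.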
This commit rewrites the docstring for int() to incorporate the documentation
changes made in issue #16036. It also switches the docstrings for int(),
str(), range(), and slice() to use multi-line signatures.
* Simplify the code: replace 4 steps with a single step using the
_PyUnicodeWriter API. PyUnicode_Format() has the same design. This avoids
storing intermediate results, which required allocating an array of pointers
on the heap.
* Use the _PyUnicodeWriter API for speed (and its convenient API):
overallocate the buffer to reduce the number of "realloc()"
* Implement "width" and "precision" in Python, don't rely on sprintf(). This
avoids the need for a temporary buffer allocated on the heap: only a small
buffer allocated on the stack is used.
* Add _PyUnicodeWriter_WriteCstr() function
* Split PyUnicode_FromFormatV() into two functions: add
unicode_fromformat_arg().
* Inline parse_format_flags(): the format of an argument is now parsed only
once, so a subfunction is no longer needed.
* Optimize PyUnicode_FromFormatV() for characters between two "%" arguments:
search for the next "%" and copy the substring in one chunk, instead of copying
character by character.
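The chunk-copy optimization for literal text between two "%" arguments can be
sketched on plain C strings. Both function names are hypothetical, the sketch
treats every "%" as the start of a two-character specifier, and it writes to a
char buffer rather than through the _PyUnicodeWriter API.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: length of the literal run starting at `fmt`,
   ending just before the next '%' or at the end of the string. */
size_t
literal_run_length(const char *fmt)
{
    assert(fmt != NULL);
    const char *next = strchr(fmt, '%');
    return next != NULL ? (size_t)(next - fmt) : strlen(fmt);
}

/* Hypothetical helper: copy only the literal text of `fmt` into `out`,
   one memcpy() per run. Returns the number of characters written,
   not counting the terminating null character. */
size_t
copy_literal_chunks(const char *fmt, char *out, size_t out_size)
{
    size_t written = 0;

    assert(fmt != NULL && out != NULL && out_size > 0);
    while (*fmt != '\0') {
        if (*fmt == '%') {
            /* A real formatter would parse and render the argument here.
               The sketch just skips a specifier such as "%s". */
            fmt += (fmt[1] != '\0') ? 2 : 1;
            continue;
        }
        size_t run = literal_run_length(fmt);
        size_t room = out_size - 1 - written;
        size_t n = run < room ? run : room;   /* truncate to fit */
        memcpy(out + written, fmt, n);        /* whole run in one chunk */
        written += n;
        fmt += run;
    }
    out[written] = '\0';
    return written;
}
```

The gain comes from paying the per-write overhead once per run of literal text
instead of once per character.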
- Use _PyLong_FormatWriter() instead of formatlong() when possible, to avoid
a temporary buffer
- Enable the fast path when the width is smaller than or equal to the length,
and when the precision is greater than or equal to the length
- Add unit tests!
- formatlong() uses PyUnicode_Resize() instead of _PyUnicode_FromASCII()
to resize the output string
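The fast-path condition can be written as a small predicate. This is a sketch
under assumptions: the function name is hypothetical, -1 is assumed to mean
"not specified", and the precision is assumed to act as a maximum length, so a
precision at least as large as the length means no truncation.

```c
#include <assert.h>

/* Hypothetical predicate: return 1 if a string of `length` characters can
   be written as-is, with no padding and no truncation, 0 otherwise.
   Assumption: width == -1 or precision == -1 means "not specified". */
int
can_use_fast_path(long length, long width, long precision)
{
    assert(length >= 0);
    /* No padding is needed if the width does not exceed the length. */
    int no_padding = (width == -1 || width <= length);
    /* Nothing is cut off if the precision covers the whole string. */
    int no_truncation = (precision == -1 || precision >= length);
    return no_padding && no_truncation;
}
```

When the predicate holds, the argument can be copied to the output directly,
without building a padded or truncated temporary string first.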