Generally comparable perf for the "good" case where memchr doesn't
return any collisions (false matches on lower byte) but clearly faster
with collisions.
Each interpreter now has its own empty bytes string and single byte
character singletons.
Replace STRINGLIB_EMPTY macro with STRINGLIB_GET_EMPTY() macro.
* Decode thousands separator and decimal point using PyUnicode_DecodeLocale()
(from the locale encoding), instead of decoding them implicitly from latin1
* Remove _PyUnicode_InsertThousandsGroupingLocale(), it was not used
* Change _PyUnicode_InsertThousandsGrouping() API to return the maximum
character if unicode is NULL
* Replace MIN/MAX macros by Py_MIN/Py_MAX
* stringlib/undef.h undefines STRINGLIB_IS_UNICODE
* stringlib/localeutil.h only supports Unicode
svn+ssh://pythondev@svn.python.org/python/trunk
........
r77461 | antoine.pitrou | 2010-01-13 08:55:48 +0100 (mer., 13 janv. 2010) | 5 lines
Issue #7622: Improve the split(), rsplit(), splitlines() and replace()
methods of bytes, bytearray and unicode objects by using a common
implementation based on stringlib's fast search. Patch by Florent Xicluna.
........
svn+ssh://pythondev@svn.python.org/python/trunk
........
r72040 | eric.smith | 2009-04-27 15:04:37 -0400 (Mon, 27 Apr 2009) | 1 line
Issue #5793: rationalize isdigit / isalpha / tolower, etc. Will port to py3k. Should fix Windows buildbot errors.
........
This is incomplete, but I want to get some version into the next alpha. I am still working on:
Documentation.
More tests.
Implement for floats.
In addition, there's an existing bug with 'n' formatting that carries forward to thousands grouping (issue 5515).
The repr() of a string now contains printable Unicode characters unescaped.
The new ascii() builtin can be used to get a repr() with only ASCII characters in it.
PEP and patch were written by Atsuo Ishimoto.
svn+ssh://pythondev@svn.python.org/python/trunk
When forward porting this, I added _PyUnicode_InsertThousandsGrouping.
........
r63078 | eric.smith | 2008-05-11 15:52:48 -0400 (Sun, 11 May 2008) | 14 lines
Addresses issue 2802: 'n' formatting for integers.
Adds 'n' as a format specifier for integers, to mirror the same
specifier which is already available for floats. 'n' is the same as
'd', but inserts the current locale-specific thousands grouping.
I added this as a stringlib function, but it's only used by str type,
not unicode. This is because of an implementation detail in
unicode.format(), which does its own str->unicode conversion. But the
unicode version will be needed in 3.0, and it may be needed by other
code eventually in 2.6 (maybe decimal?), so I left it as a stringlib
implementation. As long as the unicode version isn't instantiated,
there's no overhead for this.
........
formatting.
Includes:
- Modifying tests for basic types to use __format__ methods, instead
of builtin "format".
- Adding PyObject_Format.
- General str/unicode cleanup discovered when backporting to 2.6.
- Removing datetimemodule.c's time_format, since it was identical
to date_format.
The files in Objects/stringlib that implement PEP 3101 (stringdefs.h,
unicodedefs.h, formatter.h, string_format.h) are identical in trunk
and py3k. Any changes from here on should be made to trunk, and
changes will propogate to py3k).
Known issues:
The string.Formatter class, as discussed in the PEP, is incomplete.
Error handling needs to conform to the PEP.
Need to fix this warning that I introduced in Python/formatter_unicode.c:
Objects/stringlib/unicodedefs.h:26: warning: `STRINGLIB_CMP' defined but not used
Need to make sure sign formatting is correct, more tests needed.
Need to remove '()' sign formatting, left over from an earlier version of the PEP.