cpython

Commit Graph

Author	SHA1	Message	Date
Serhiy Storchaka	4fdb68491e	Issue #22896: Avoid to use PyObject_AsCharBuffer(), PyObject_AsReadBuffer() and PyObject_AsWriteBuffer().	2015-02-03 01:21:08 +02:00
Serhiy Storchaka	b757c83ec6	Issue #22581: Use more "bytes-like object" throughout the docs and comments.	2014-12-05 22:25:22 +02:00
Benjamin Peterson	1cc9520327	s/stringobject/bytesobject/ (closes #22036) Patch by Martin Matusiak.	2014-07-23 21:39:37 -07:00
Benjamin Peterson	d455ce4fd4	merge 3.3	2014-03-30 19:52:39 -04:00
Benjamin Peterson	0ad6098b67	merge 3.2	2014-03-30 19:52:22 -04:00
Benjamin Peterson	23cf403ca1	fix expandtabs overflow detection to be consistent and not rely on signed overflow	2014-03-30 19:47:57 -04:00
Serhiy Storchaka	3079328d29	Reverted changeset b72c5573c5e7 (issue #15027).	2014-01-04 22:44:01 +02:00
Serhiy Storchaka	583a93943c	Issue #15027: Rewrite the UTF-32 encoder. It is now 1.6x to 3.5x faster.	2014-01-04 19:25:37 +02:00
Benjamin Peterson	0ee22bf774	fix format spec recursive expansion (closes #19729)	2013-11-26 19:22:36 -06:00
Serhiy Storchaka	dc2fd5101a	Remove dead code committed in issue #12892.	2013-11-19 15:56:05 +02:00
Serhiy Storchaka	58cf607d13	Issue #12892: The utf-16* and utf-32* codecs now reject (lone) surrogates. The utf-16* and utf-32* encoders no longer allow surrogate code points (U+D800-U+DFFF) to be encoded. The utf-32* decoders no longer decode byte sequences that correspond to surrogate code points. The surrogatepass error handler now works with the utf-16* and utf-32* codecs. Based on patches by Victor Stinner and Kang-Hao (Kenny) Lu.	2013-11-19 11:32:41 +02:00
Ezio Melotti	745d54d2fa	#17806: Added keyword-argument support for "tabsize" to str/bytes.expandtabs().	2013-11-16 19:10:57 +02:00
Victor Stinner	cc64eb5b9f	Issue #18408: Fix bytearrayiter.partition()/rpartition(), handle PyByteArray_FromStringAndSize() failure (ex: on memory allocation failure)	2013-10-29 03:15:37 +01:00
Serhiy Storchaka	8fa8ee3970	Issue #18701: Remove support of old CPython versions (<3.0) from C code.	2013-08-17 00:48:02 +03:00
Raymond Hettinger	d06eeb4a24	merge	2013-08-13 18:20:55 -07:00
Raymond Hettinger	b1b915c796	Issue 18719: Remove a false optimization Remove an unused early-out test from the critical path for dict and set lookups. When the strings already have matching lengths, kinds, and hashes, there is no additional information gained by checking the first characters (the probability of a mismatch is already known to be less than 1 in 2**64).	2013-08-13 18:16:34 -07:00
Antoine Pitrou	9ed5f27266	Issue #18722: Remove uses of the "register" keyword in C code.	2013-08-13 20:18:52 +02:00
Benjamin Peterson	d2b58a9880	only recursively expand in the format spec (closes #17644)	2013-05-17 17:34:30 -05:00
Benjamin Peterson	4d94474ba3	rewrite the parsing of field names to be more consistent wrt recursive expansion	2013-05-17 18:22:31 -05:00
Benjamin Peterson	48953632df	merge 3.3	2013-05-17 17:35:28 -05:00
Ezio Melotti	5263c13801	Merge removal of trailing whitespace from 3.3.	2013-04-21 04:08:18 +03:00
Ezio Melotti	6b02772c13	Remove trailing whitespace.	2013-04-21 04:07:51 +03:00
Victor Stinner	8f674ccd64	Close #17694: Add minimum length to _PyUnicodeWriter * Add also min_char attribute to _PyUnicodeWriter structure (currently unused) * _PyUnicodeWriter_Init() has no more argument (except the writer itself): min_length and overallocate must be set explicitly * In error handlers, only enable overallocation if the replacement string is longer than 1 character * CJK decoders don't use overallocation anymore * Set min_length, instead of preallocating memory using _PyUnicodeWriter_Prepare(), in many decoders * _PyUnicode_DecodeUnicodeInternal() checks for integer overflow	2013-04-17 23:02:17 +02:00
Victor Stinner	76b3b2726c	stringlib: remove unused STRINGLIB_RESIZE macro	2013-04-14 16:29:09 +02:00
Serhiy Storchaka	e2cef885a2	Issue #16061: Speed up str.replace() for replacing 1-character strings.	2013-04-13 22:45:04 +03:00
Victor Stinner	7efa3b8242	Close #13126: "Simplify" FASTSEARCH() code to help the compiler to emit more efficient machine code. Patch written by Antoine Pitrou. Without this change, str.find() was 10% slower than str.rfind() in the worst case.	2013-04-08 00:26:43 +02:00
Victor Stinner	cfc4c13b04	Add _PyUnicodeWriter_WriteSubstring() function Write a function to enable more optimizations: * If the substring is the whole string and overallocation is disabled, just keep a reference to the string, don't copy characters * Avoid a call to the expensive _PyUnicode_FindMaxChar() function when possible	2013-04-03 01:48:39 +02:00
Serhiy Storchaka	06b16f879f	Remove unused defines.	2013-02-23 14:49:09 +02:00
Serhiy Storchaka	18809fa94e	Remove unused defines.	2013-02-23 14:48:16 +02:00
Antoine Pitrou	4de7457009	Issue #17173: Remove uses of locale-dependent C functions (isalpha() etc.) in the interpreter. I've left a couple of them in: zlib (third-party lib), getaddrinfo.c (doesn't include Python.h, and probably obsolete), _sre.c (legitimate use for the re.LOCALE flag).	2013-02-09 23:11:27 +01:00
Serhiy Storchaka	b946af5897	Check for NULL before the pointer aligning in fastsearch_memchr_1char. There is no guarantee that NULL is aligned.	2013-01-15 13:32:41 +02:00
Serhiy Storchaka	18ba40b945	Check for NULL before the pointer aligning in fastsearch_memchr_1char. There is no guarantee that NULL is aligned.	2013-01-15 13:27:28 +02:00
Christian Heimes	5f7e8dab11	Issue #16592: stringlib_bytes_join doesn't raise MemoryError on allocation failure	2012-12-02 07:56:42 +01:00
Victor Stinner	6caa6fb535	(Merge 3.3) Issue #8271: Fix compilation on Windows	2012-11-05 00:00:50 +01:00
Victor Stinner	ab60de478d	Issue #8271: Fix compilation on Windows	2012-11-04 23:59:15 +01:00
Ezio Melotti	cfa9636404	#8271: merge with 3.3.	2012-11-04 23:23:09 +02:00
Ezio Melotti	f7ed5d111b	#8271: the utf-8 decoder now outputs the correct number of U+FFFD characters when used with the "replace" error handler on invalid utf-8 sequences. Patch by Serhiy Storchaka, tests by Ezio Melotti.	2012-11-04 23:21:38 +02:00
Antoine Pitrou	6f7b0da6bc	Issue #12805: Make bytes.join and bytearray.join faster when the separator is empty. Patch by Serhiy Storchaka.	2012-10-20 23:08:34 +02:00
Christian Heimes	743e0cd6b5	Issue #16166: Add PY_LITTLE_ENDIAN and PY_BIG_ENDIAN macros and unified endianess detection and handling.	2012-10-17 23:52:17 +02:00
Antoine Pitrou	cfc22b4a9b	Issue #15958: bytes.join and bytearray.join now accept arbitrary buffer objects.	2012-10-16 21:07:23 +02:00
Antoine Pitrou	ca8aa4acf6	Issue #15144: Fix possible integer overflow when handling pointers as integer values, by using Py_uintptr_t instead of size_t. Patch by Serhiy Storchaka.	2012-09-20 20:56:47 +02:00
Victor Stinner	b3f5501250	Close #15534: Fix a typo in the fast search function of the string library (_s => s) Replace _s with ptr to avoid future confusion. Add also non regression tests.	2012-08-02 23:05:01 +02:00
Mark Dickinson	fb90c0934c	Issue #14700: Fix buggy overflow checks for large precision and width in new-style and old-style formatting.	2012-10-28 10:18:03 +00:00
Mark Dickinson	01ac8b6ab1	Use correct types for ASCII_CHAR_MASK integer constants.	2012-07-07 14:08:48 +02:00
Mark Dickinson	106c4145ff	Issue #14923: Optimize continuation-byte check in UTF-8 decoding. Patch by Serhiy Storchaka.	2012-06-23 21:45:14 +01:00
Antoine Pitrou	a759d4e9f4	Make private function static (from `make smelly`)	2012-06-21 17:26:28 +02:00
Antoine Pitrou	27f6a3b0bf	Issue #15026: utf-16 encoding is now significantly faster (up to 10x). Patch by Serhiy Storchaka.	2012-06-15 22:15:23 +02:00
Victor Stinner	d7b7c7472b	Issue #14993: Use standard "unsigned char" instead of a unsigned char bitfield	2012-06-04 22:52:12 +02:00
Victor Stinner	d3f0882dfb	Issue #14744: Use the new _PyUnicodeWriter internal API to speed up str%args and str.format(args) * Formatting string, int, float and complex use the _PyUnicodeWriter API. It avoids a temporary buffer in most cases. * Add _PyUnicodeWriter_WriteStr() to restore the PyAccu optimization: just keep a reference to the string if the output is only composed of one string * Disable overallocation when formatting the last argument of str%args and str.format(args) * Overallocation allocates at least 100 characters: add min_length attribute to the _PyUnicodeWriter structure * Add new private functions: _PyUnicode_FastCopyCharacters(), _PyUnicode_FastFill() and _PyUnicode_FromASCII() The speed up is around 20% in average.	2012-05-29 12:57:52 +02:00
Antoine Pitrou	63065d761e	Issue #14624: UTF-16 decoding is now 3x to 4x faster on various inputs. Patch by Serhiy Storchaka.	2012-05-15 23:48:04 +02:00
Antoine Pitrou	ca5f91b888	Issue #14738: Speed-up UTF-8 decoding on non-ASCII data. Patch by Serhiy Storchaka.	2012-05-10 16:36:02 +02:00
Victor Stinner	3b1a74a9c3	Rename unicode_write_t structure and its methods to "_PyUnicodeWriter"	2012-05-09 22:25:00 +02:00
Victor Stinner	ee4544c920	Issue #14744: Inline unicode_writer_write_char() and unicode_write_str() Optimize also PyUnicode_Format(): call unicode_writer_prepare() only once per argument.	2012-05-09 22:24:08 +02:00
Victor Stinner	202fdca133	Close #14716: str.format() now uses the new "unicode writer" API instead of the PyAccu API. For example, it makes str.format() from 25% to 30% faster on Linux.	2012-05-07 12:47:02 +02:00
Antoine Pitrou	d0acb411ef	Issue #14387: Do not include accu.h from Python.h.	2012-03-22 14:42:18 +01:00
Victor Stinner	41a863cb81	Issue #13706: Fix format(int, "n") for locale with non-ASCII thousands separator * Decode thousands separator and decimal point using PyUnicode_DecodeLocale() (from the locale encoding), instead of decoding them implicitly from latin1 * Remove _PyUnicode_InsertThousandsGroupingLocale(), it was not used * Change _PyUnicode_InsertThousandsGrouping() API to return the maximum character if unicode is NULL * Replace MIN/MAX macros by Py_MIN/Py_MAX * stringlib/undef.h undefines STRINGLIB_IS_UNICODE * stringlib/localeutil.h only supports Unicode	2012-02-24 00:37:51 +01:00
Benjamin Peterson	21e0da228d	remove some usage of Py_UNICODE_TOUPPER/LOWER	2012-01-11 21:00:42 -05:00
Victor Stinner	6099a03202	Issue #13624: Write a specialized UTF-8 encoder to allow more optimization The main bottleneck was the PyUnicode_READ() macro.	2011-12-18 14:22:26 +01:00
Victor Stinner	f8eac00779	Issue #13623: Fix a performance regression introduced by issue #12170 in bytes.find() and handle correctly OverflowError (raise the same ValueError than the error for -1).	2011-12-18 01:17:41 +01:00
Victor Stinner	b37b17423b	Replace PyUnicode_FromUnicode(NULL, 0) by PyUnicode_New(0, 0) Create an empty string with the new Unicode API.	2011-12-01 03:18:59 +01:00
Antoine Pitrou	0a3229de6b	Issue #13417: speed up utf-8 decoding by around 2x for the non-fully-ASCII case. This almost catches up with pre-PEP 393 performance, when decoding needed only one pass.	2011-11-21 20:39:13 +01:00
Victor Stinner	0fc35196bb	stringlib: remove unused STRINGLIB_FILL	2011-11-20 19:30:15 +01:00
Victor Stinner	7931d9a951	Replace PyUnicodeObject type by PyObject * _PyUnicode_CheckConsistency() now takes a PyObject* instead of void* * Remove now useless casts to PyObject*	2011-11-04 00:22:48 +01:00
Victor Stinner	9db1a8b69f	Replace PyUnicodeObject* by PyObject* where it was irrevelant A Unicode string can now be a PyASCIIObject, PyCompactUnicodeObject or PyUnicodeObject. Aliasing a PyASCIIObject* or PyCompactUnicodeObject* to PyUnicodeObject* is wrong	2011-10-23 20:04:37 +02:00
Antoine Pitrou	ac65d96777	Issue #12170: The count(), find(), rfind(), index() and rindex() methods of bytes and bytearray objects now accept an integer between 0 and 255 as their first argument. Patch by Petri Lehtinen.	2011-10-20 23:54:17 +02:00
Antoine Pitrou	5b9f4c1539	Fix typo	2011-10-17 19:21:04 +02:00
Antoine Pitrou	c198d0599b	Add a comment explaining this heuristic.	2011-10-13 18:07:37 +02:00
Antoine Pitrou	dda339e6d2	Simplify heuristic for when to use memchr	2011-10-13 17:58:11 +02:00
Antoine Pitrou	dd4e2f0153	Issue #13155: Optimize finding the optimal character width of an unicode string	2011-10-13 00:02:27 +02:00
Victor Stinner	d218bf14cc	stringlib: Fix STRINGLIB_STR for UCS2/UCS4	2011-10-12 00:14:32 +02:00
Victor Stinner	8cc70dcf70	Fix fastsearch for UCS2 and UCS4 * If needle is 0, try (p[0] >> 16) & 0xff for UCS4 * Disable fastsearch_memchr_1char() if needle is zero for UCS2 and UCS4	2011-10-11 23:22:22 +02:00
Antoine Pitrou	2c3b2302ad	Issue #13134: optimize finding single-character strings using memchr	2011-10-11 20:29:21 +02:00
Martin v. Löwis	c47adb04b3	Change PyUnicode_KIND to 1,2,4. Drop _KIND_SIZE and _CHARACTER_SIZE.	2011-10-07 20:55:35 +02:00
Antoine Pitrou	4574e62c6e	Fix massive slowdown in string formatting with str.format. Example: ./python -m timeit -s "f='{}' + '-' * 1024 + '{}'; s='abcd' * 16384" "f.format(s, s)" -> before: 547 usec per loop -> after: 13 usec per loop -> 3.2: 22.5 usec per loop -> 2.7: 12.6 usec per loop	2011-10-07 02:26:47 +02:00
Antoine Pitrou	dbf697ae5c	Fix compilation warnings under 64-bit Windows	2011-10-06 15:34:41 +02:00
Victor Stinner	c3cec7868b	Add asciilib: similar to ucs1, ucs2 and ucs4 library, but specialized to ASCII ucs1, ucs2 and ucs4 libraries have to scan created substring to find the maximum character, whereas it is not need to ASCII strings. Because ASCII strings are common, it is useful to optimize ASCII.	2011-10-05 21:24:08 +02:00
Victor Stinner	e57b1c0da1	Mark PyUnicode_FromUCS[124] as private	2011-09-28 22:20:48 +02:00
Martin v. Löwis	d63a3b8beb	Implement PEP 393.	2011-09-28 07:41:54 +02:00
Mark Dickinson	c7d93b7614	Issue #1621: Fix undefined behaviour from signed overflow in datetime module hashes, array and list iterations, and get_integer (stringlib/string_format.h)	2011-09-25 15:34:32 +01:00
Mark Dickinson	36f27c995a	Issue #1621: Fix undefined behaviour from signed overflow in get_integer (stringlib/formatter.h)	2011-09-24 19:11:53 +01:00
Eric V. Smith	12ebefc9d3	Closes #12579. Positional fields with str.format_map() now raise a ValueError instead of SystemError.	2011-07-18 14:03:41 -04:00
Jesus Cea	6159ee3cf5	MERGE: startswith and endswith don't accept None as slice index. Patch by Torsten Becker. (closes #11828)	2011-04-20 17:42:50 +02:00
Jesus Cea	ac4515063c	startswith and endswith don't accept None as slice index. Patch by Torsten Becker. (closes #11828)	2011-04-20 17:09:23 +02:00
Ezio Melotti	4969f709cc	#11515: Merge with 3.1.	2011-03-15 05:59:46 +02:00
Ezio Melotti	42da663e6f	#11515: fix several typos. Patch by Piotr Kasprzyk.	2011-03-15 05:18:48 +02:00
Eric Smith	a1eac7218b	Issue #11302: missing type check on _string.formatter_field_name_split and _string.formatter_parser caused crash. Originial patch by haypo, reviewed by me, okayed by Georg.	2011-01-29 11:15:35 +00:00
Eric Smith	984bb58000	Issue #7094: Add alternate ('#') flag to __format__ methods for float, complex and Decimal. Allows greater control over when decimal points appear. Added to make transitioning from %-formatting easier. '#g' still has a problem with Decimal which I'll fix soon.	2010-11-25 16:08:06 +00:00
Antoine Pitrou	a277ec4ad9	Followup to r86170: fix reference leak in str.format	2010-11-05 12:23:55 +00:00
Eric Smith	27bbca6f79	Issue #6081: Add str.format_map. str.format_map(mapping) is similar to str.format(**mapping), except mapping does not get converted to a dict.	2010-11-04 17:06:58 +00:00
Georg Brandl	66c221e993	#9418: first step of moving private string methods to _string module.	2010-10-14 07:04:07 +00:00
Florent Xicluna	eb6f3ead00	Fix #8530: Prevent stringlib fastsearch from reading beyond the front of an array.	2010-08-08 22:07:16 +00:00
Mark Dickinson	388122d43b	Issue #9337: Make float.__str__ identical to float.__repr__. (And similarly for complex numbers.)	2010-08-04 20:56:28 +00:00
Mark Dickinson	fc070313dd	Merged revisions 83400 via svnmerge from svn+ssh://pythondev@svn.python.org/python/branches/py3k ........ r83400 | mark.dickinson | 2010-08-01 11:41:49 +0100 (Sun, 01 Aug 2010) | 7 lines Issue #9416: Fix some issues with complex formatting where the output with no type specifier failed to match the str output: - format(complex(-0.0, 2.0), '-') omitted the real part from the output, - format(complex(0.0, 2.0), '-') included a sign and parentheses. ........	2010-08-01 10:43:42 +00:00
Mark Dickinson	5b65df7ce2	Issue #9416: Fix some issues with complex formatting where the output with no type specifier failed to match the str output: - format(complex(-0.0, 2.0), '-') omitted the real part from the output, - format(complex(0.0, 2.0), '-') included a sign and parentheses.	2010-08-01 10:41:49 +00:00
Benjamin Peterson	3b107f99c7	remove unneeded error check	2010-07-11 03:36:35 +00:00
Benjamin Peterson	99bcf5ce08	Merged revisions 81823,81835 via svnmerge from svn+ssh://pythondev@svn.python.org/python/branches/py3k ................ r81823 | benjamin.peterson | 2010-06-07 17:31:26 -0500 (Mon, 07 Jun 2010) | 9 lines Merged revisions 81820 via svnmerge from svn+ssh://pythondev@svn.python.org/python/trunk ........ r81820 | benjamin.peterson | 2010-06-07 17:23:23 -0500 (Mon, 07 Jun 2010) | 1 line correctly overflow when indexes are too large ........ ................ r81835 | benjamin.peterson | 2010-06-08 09:57:22 -0500 (Tue, 08 Jun 2010) | 9 lines Merged revisions 81834 via svnmerge from svn+ssh://pythondev@svn.python.org/python/trunk ........ r81834 | benjamin.peterson | 2010-06-08 09:53:29 -0500 (Tue, 08 Jun 2010) | 1 line kill extra word ........ ................	2010-06-08 15:12:17 +00:00
Benjamin Peterson	504b6e8115	Merged revisions 81824 via svnmerge from svn+ssh://pythondev@svn.python.org/python/trunk ........ r81824 | benjamin.peterson | 2010-06-07 17:32:44 -0500 (Mon, 07 Jun 2010) | 1 line remove extra byte and fix comment ........	2010-06-07 22:35:08 +00:00
Benjamin Peterson	59a1b2f732	Merged revisions 81820 via svnmerge from svn+ssh://pythondev@svn.python.org/python/trunk ........ r81820 | benjamin.peterson | 2010-06-07 17:23:23 -0500 (Mon, 07 Jun 2010) | 1 line correctly overflow when indexes are too large ........	2010-06-07 22:31:26 +00:00
Benjamin Peterson	d240071cd8	Merged revisions 81813 via svnmerge from svn+ssh://pythondev@svn.python.org/python/trunk ........ r81813 | benjamin.peterson | 2010-06-07 16:37:09 -0500 (Mon, 07 Jun 2010) | 2 lines locale grouping strings should end in '\0' ........	2010-06-07 21:41:35 +00:00
Antoine Pitrou	7f14f0d8a0	Recorded merge of revisions 81032 via svnmerge from svn+ssh://pythondev@svn.python.org/python/branches/py3k ................ r81032 | antoine.pitrou | 2010-05-09 17:52:27 +0200 (dim., 09 mai 2010) | 9 lines Recorded merge of revisions 81029 via svnmerge from svn+ssh://pythondev@svn.python.org/python/trunk ........ r81029 | antoine.pitrou | 2010-05-09 16:46:46 +0200 (dim., 09 mai 2010) | 3 lines Untabify C files. Will watch buildbots. ........ ................	2010-05-09 16:14:21 +00:00

241 Commits